Methodology

How we built this dataset

A plain-English walkthrough of where the OpenPayrolls Texas data comes from, how it's cleaned, what we publish, and what we deliberately leave out.

1. The source of record

Every salary record on OpenPayrolls Texas is derived from a publicly available payroll snapshot of the State of Texas workforce. The specific file we mirror is the non-duplicated employees CSV released by the Texas Tribune as part of its Government Salaries Explorer. The Tribune in turn obtains the data from the Texas Comptroller of Public Accounts under the Texas Public Information Act. The release reflected on this site is dated 2026; the file URL is preserved in our seed script for reproducibility.

We chose this source because it is already de-duplicated at the row level (people who hold multiple positions appear once with their primary role) and because it carries the same provenance the rest of the Texas reporting community relies on. If you reproduce our numbers, this is the file to start with.

2. What we filter out

We exclude two kinds of records before the data is published on this site:

Hidden-from-search records. The source file flags a small number of employees whose names should not appear in public search interfaces, typically because their roles are exempt from disclosure under Texas Government Code Chapter 552. We honor that flag and drop the row entirely.
Duplicate rows. Where the source carries a duplicate flag (a multiple-position holder whose other rows we would otherwise also pick up), we keep only the canonical, non-duplicated row.
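
The two exclusions above can be sketched roughly as follows. This is an illustration in Python, not the site's PHP seed script. The column names `hidden_from_search` and `duplicated`, and the set of values treated as "true", are assumptions; the header row of the source CSV is the authority on the real names.

```python
import csv
import io

# Hypothetical flag columns; the real CSV header may name them differently.
HIDDEN_COL = "hidden_from_search"
DUPLICATE_COL = "duplicated"
TRUTHY = {"true", "t", "1", "yes", "y"}


def is_flagged(row, column):
    """Return True when the flag column holds a truthy marker."""
    return (row.get(column) or "").strip().lower() in TRUTHY


def publishable_rows(csv_text):
    """Drop hidden-from-search rows and rows flagged as duplicates."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if not is_flagged(row, HIDDEN_COL) and not is_flagged(row, DUPLICATE_COL)
    ]


# Made-up rows for illustration only.
sample = (
    "name,hidden_from_search,duplicated\n"
    "A. Example,false,false\n"
    "B. Example,true,false\n"
    "C. Example,false,true\n"
)
kept = publishable_rows(sample)
print([row["name"] for row in kept])  # ['A. Example']
```

Both checks run on every row, so a record that is hidden and also flagged as a duplicate is dropped once, not counted twice.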

3. How we choose which employees to publish

The full underlying release contains roughly 155,000 records, which is more than a browsable site needs and enough to slow page loads noticeably on low-cost hosting. We publish a curated working set of approximately 6,000 employees that combines two slices of the source data:

The 1,500 highest-paid employees, so every agency and job-title page surfaces meaningful top earners.
A random sample of the remaining records, so the rest of the dataset accurately reflects mid- and low-pay roles, not just the C-suite.
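
A minimal sketch of how such a working set could be assembled, written in Python for illustration (the actual seed script is PHP). The `annual_pay` key, the `working_set` helper, and the default seed value are hypothetical; only the 1,500 and roughly 6,000 figures come from the text above.

```python
import random


def working_set(records, top_n=1500, total=6000, seed=2026):
    """Combine the top_n highest-paid records with a seeded random sample of the rest."""
    ranked = sorted(records, key=lambda r: r["annual_pay"], reverse=True)
    top, rest = ranked[:top_n], ranked[top_n:]
    sample_size = min(max(total - top_n, 0), len(rest))
    rng = random.Random(seed)  # fixed seed so a rebuild picks the same rows
    return top + rng.sample(rest, sample_size)


# Tiny synthetic example: 20 records, keep the top 3 plus 5 sampled at random.
records = [{"id": i, "annual_pay": 30000 + 1000 * i} for i in range(20)]
subset = working_set(records, top_n=3, total=8, seed=1)
print(len(subset), [r["id"] for r in subset[:3]])  # 8 [19, 18, 17]
```

Using a dedicated `random.Random(seed)` instance, instead of the module-level generator, keeps the sample independent of any other random calls made during a build.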

This is not the same as publishing the full dataset, and we say so plainly. If you need every record for an investigation, the original Tribune CSV is one click away from our seed script. We make no claim to be a comprehensive raw archive; we are a browsable lens on top of one.

4. What "annual pay" means

The annual-pay number on every employee page is the annualized base salary at the time of the snapshot. For salaried employees that is the monthly base rate × 12. For hourly employees it is the hourly rate × scheduled weekly hours × 52. It does not include:

Overtime actually worked
Hazardous-duty or longevity pay (paid separately under Chapter 659)
Performance bonuses, retention bonuses, or settlement payments
Health insurance, retirement contributions, or other benefits
Supplemental compensation paid from non-state funds — including foundation salary supplements at universities, athletics revenue at major university programs, federal grant supplements, or fee-supported program incentives

For the great majority of Texas state employees, base pay is the dominant component of total compensation. For senior university administrators, head coaches, and certain regulated medical positions, base pay can be a small fraction of total earnings, so treat the number on this site as a floor, not a ceiling, for those roles. When in doubt, consult the agency's annual financial report.
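
The annualization rule at the start of this section can be written as a short function. This is a sketch in Python with made-up rates; the `pay_type` labels and the function name are hypothetical, and it assumes "scheduled hours" means hours per week, which the × 52 multiplier implies.

```python
def annualized_base_pay(pay_type, rate, weekly_hours=None):
    """Annualize a base rate: monthly x 12 if salaried, hourly x weekly hours x 52 if hourly."""
    if pay_type == "salaried":
        return rate * 12
    if pay_type == "hourly":
        if weekly_hours is None:
            raise ValueError("hourly records need scheduled weekly hours")
        return rate * weekly_hours * 52
    raise ValueError(f"unknown pay type: {pay_type!r}")


print(annualized_base_pay("salaried", 5000))   # 60000
print(annualized_base_pay("hourly", 20, 40))   # 41600
```

None of the excluded items listed above (overtime, longevity pay, bonuses, benefits, non-state supplements) enter this calculation.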

5. Aggregations

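Every agency page and job-title page shows three derived statistics: average pay, highest pay, and lowest pay. These are computed only across the records actually published on this site (the curated working set described above), not across the full Tribune file. Because that working set includes all of the 1,500 highest-paid employees but only a sample of everyone else, averages on this site will tend to sit above the averages you would compute from the full file. The numbers are stable per release: they will not drift between page loads.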
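
As an illustration, the three per-agency statistics could be computed like this. The sketch is in Python, not the site's PHP; the `agency` and `annual_pay` keys, the agency names, and the pay figures are all made up.

```python
from collections import defaultdict


def agency_stats(records):
    """Return {agency: {"average", "highest", "lowest"}} over the given records."""
    pay_by_agency = defaultdict(list)
    for record in records:
        pay_by_agency[record["agency"]].append(record["annual_pay"])
    return {
        agency: {
            "average": sum(pays) / len(pays),
            "highest": max(pays),
            "lowest": min(pays),
        }
        for agency, pays in pay_by_agency.items()
    }


# Only published records go in, so the statistics describe the working set.
published = [
    {"agency": "Example Agency A", "annual_pay": 40000},
    {"agency": "Example Agency A", "annual_pay": 60000},
    {"agency": "Example Agency B", "annual_pay": 55000},
]
stats = agency_stats(published)
print(stats["Example Agency A"])
# {'average': 50000.0, 'highest': 60000, 'lowest': 40000}
```

The same grouping works for job-title pages by keying on the title instead of the agency.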

Sector pages additionally show a sector-wide average across every agency in that sector. Sectors are assigned heuristically from the agency name (universities and A&M campuses to "Higher Education", anything containing "Police" or "Public Safety" or "Criminal Justice" to "Public Safety", and so on). The sector grouping is editorial; if a particular agency feels misclassified to you, please flag it via our contact page.
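
A name-based heuristic of this kind might look like the sketch below (Python, for illustration). Only the keywords quoted in the paragraph above come from the site's description; the rule order, the `sector_for` helper, and the "Other" fallback are assumptions.

```python
# Ordered (keywords, sector) rules; the first matching rule wins.
SECTOR_RULES = [
    (("university", "a&m"), "Higher Education"),
    (("police", "public safety", "criminal justice"), "Public Safety"),
]


def sector_for(agency_name, default="Other"):
    """Assign a sector by case-insensitive substring match on the agency name."""
    name = agency_name.lower()
    for keywords, sector in SECTOR_RULES:
        if any(keyword in name for keyword in keywords):
            return sector
    return default  # hypothetical fallback bucket


print(sector_for("Texas A&M University"))            # Higher Education
print(sector_for("Department of Public Safety"))     # Public Safety
print(sector_for("Comptroller of Public Accounts"))  # Other
```

Rule order matters: under this ordering an agency name containing both "University" and "Police" would land in "Higher Education", which is the kind of editorial call the contact page exists to review.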

6. URL design

Slugs are derived from the published agency, job-title, or employee-name text by lowercasing, replacing non-alphanumeric runs with hyphens, and trimming leading and trailing hyphens. Where the source produces collisions (for example, two distinct employees with identical names), we append a numeric suffix in seeding order to keep URLs unique. Slugs are stable per release; they may drift between releases if the State of Texas renames an agency.
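
The slug rules above can be sketched in a few lines of Python (illustrative only; the site itself is built in PHP). The exact suffix format, starting at `-2` for the second occurrence, is an assumption, as are the sample names.

```python
import re


def slugify(text):
    """Lowercase, collapse non-alphanumeric runs to hyphens, trim stray hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def unique_slugs(names):
    """Assign slugs in input (seeding) order, suffixing repeats with -2, -3, ..."""
    seen = {}
    slugs = []
    for name in names:
        base = slugify(name)
        seen[base] = seen.get(base, 0) + 1
        slugs.append(base if seen[base] == 1 else f"{base}-{seen[base]}")
    return slugs


print(slugify("Health & Human Services Commission"))
# health-human-services-commission
print(unique_slugs(["Jane Doe", "John Roe", "Jane Doe"]))
# ['jane-doe', 'john-roe', 'jane-doe-2']
```

Because suffixes follow input order, the same source file in the same order always yields the same URLs. This sketch does not guard against a suffixed slug clashing with a name that already ends in a number.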

7. Refresh cadence

The State of Texas releases a new payroll snapshot roughly twice a year. We refresh OpenPayrolls within two weeks of each new release. Dates are visible in the dataset metadata in the footer of every page.

8. Reproducibility

The full code that builds this site is open. The seed.php script in our repository, written in PHP, reads the public Tribune CSV, applies the filters and aggregations described above, and writes a small set of JSON files to disk that the page templates render from. Anyone with a copy of the seed script and the source CSV can regenerate every page on this site byte-for-byte, provided the random sample in section 3 is drawn with the same seed our build uses.

9. Corrections

Because we mirror a publisher of record, corrections must originate upstream. If a name, title, or pay figure is wrong on this site, it is almost certainly wrong in the underlying state release as well. The right path is to contact the agency's HR office and the Comptroller's open-data team, who will republish a corrected snapshot in the next cycle. We will pick up the correction at our next refresh. If you believe a record should be suppressed entirely for a safety reason, please contact us and we will work with the upstream publisher.

Methodology last updated: May 2026.