HR Data Collection

Data collection is the process of gathering and measuring information on targeted variables in an established system, which then enables one to answer relevant questions and evaluate outcomes.

HR departments have a tradition of collecting vast amounts of HR data. Unfortunately, this data often remains unused. As soon as organizations start to analyze their people problems by using this data, they are engaged in HR analytics.

By using HR analytics you don’t have to rely on gut feeling anymore. Analytics enables HR professionals to make data-driven decisions. Furthermore, analytics helps to test the effectiveness of HR policies and different interventions.

Broadly, the data required by an HR analytics tool is classified into internal and external data. One of the biggest challenges in data collection is gathering the right data at sufficient quality.

Internal data

Internal data specifically refers to data obtained from the HR department of an organization. The core HR system contains several data points that can be used for an HR analytics tool. Some of the metrics that an HRIS contains include:

  • Employee tenure
  • Employee compensation
  • Employee training records
  • Performance appraisal data
  • Reporting structure
  • Details on high-value, high-potential employees
  • Details on any disciplinary action taken against an employee

The main challenge here is that this data is sometimes scattered across disconnected sources, so it may not serve as a reliable measure on its own. This is where a data scientist can play a meaningful role: they can organize the scattered data and create buckets of relevant data points, which can then be used by the analytics tool.
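As a rough illustration of organizing scattered records, the sketch below merges extracts from separate sources into one profile per employee. The field names (`employee_id`, `tenure_years`, `salary`, `courses_completed`) and the sample values are assumptions made for the example, not fields of any particular HRIS.

```python
from collections import defaultdict

# Hypothetical extracts from three disconnected sources,
# each keyed by an assumed employee_id field.
tenure_records = [
    {"employee_id": 1, "tenure_years": 4.5},
    {"employee_id": 2, "tenure_years": 1.0},
]
compensation_records = [
    {"employee_id": 1, "salary": 62000},
    {"employee_id": 2, "salary": 48000},
]
training_records = [
    {"employee_id": 1, "courses_completed": 3},
]


def consolidate(*sources):
    """Merge records from several sources into one profile per employee_id."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            profiles[record["employee_id"]].update(record)
    return dict(profiles)


profiles = consolidate(tenure_records, compensation_records, training_records)
print(profiles[1])
```

Employee 2 has no training record here, so that profile simply lacks the `courses_completed` field; gaps like this are one reason disconnected data may not be a reliable measure until it has been reviewed.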

External data

External data is obtained by establishing working relationships with other departments of the organization. Data from outside the organization is also essential, as it offers a global perspective that working with data from within the organization cannot.

  • Financial data: Organization-wide financial data is key in any HR analysis to calculate, for instance, the revenue per employee or the cost of hire.
  • Organization-specific data: Depending on the type of organization and its core offering (product or service), the type of data that HR needs to supplement analytics will vary.
  • Passive data from employees: Employees continually provide data that is stored in the HRIS from the moment they are approached for a job. Additionally, data from their social media posts and shares and from feedback surveys can be used to guide HR data analysis.
  • Historical data: Several global economic, political, or environmental events determine patterns in employee behavior. Such data can offer insights that limited internal data cannot.
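The two financial metrics mentioned in the list above can be sketched as simple ratios. This is a minimal example that assumes one commonly used definition of cost of hire (internal plus external recruiting costs, divided by the number of hires); the function names and all figures are hypothetical.

```python
def revenue_per_employee(total_revenue, headcount):
    """Total revenue divided by the number of employees."""
    if headcount <= 0:
        raise ValueError("headcount must be positive")
    return total_revenue / headcount


def cost_per_hire(internal_costs, external_costs, hires):
    """Assumed definition: (internal + external recruiting costs) / hires."""
    if hires <= 0:
        raise ValueError("hires must be positive")
    return (internal_costs + external_costs) / hires


# Hypothetical figures for one financial year.
print(revenue_per_employee(12_000_000, 150))  # 80000.0
print(cost_per_hire(90_000, 60_000, 25))      # 6000.0
```

The revenue figure comes from finance rather than from the HR system, which is why this kind of analysis depends on data shared by other departments.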

Data Sources

HR professionals gather data points across the organization from sources like:

  • Employee surveys
  • Attendance records
  • Employee reviews
  • Salary and promotion history
  • Employee work history
  • Demographic data
  • Personality data
  • Recruitment process
  • Employee databases

Data Collection Plans

A data collection plan is a guide that identifies goals, objectives, and special focus areas, and lays out timelines, procedures, and best practices for collecting data. You will need to follow a series of steps to ensure that the data collection process is stable and reliable:

  • Formulate a clear statement of the problem
  • Define and list the characteristics to be measured
  • Select the right measurement technique
  • Construct a clear and simple data collection form
  • Arrange the sampling method
  • Determine who will collect the data, who will analyze and interpret the data, and who will report the results

Data Collection Methods

A few types of data collection methods include:

  • Check sheets – A check sheet is a structured, prepared form for collecting and analyzing data, consisting of a list of items and some indication of how often each item occurs. There are several types: confirmation check sheets, which confirm whether all steps in a process have been completed; process check sheets, which record the frequency of observations within a range of measurement; defect check sheets, which record the observed frequency of defects; and stratified check sheets, which record the observed frequency of defects by defect type and one other criterion. Check sheets are easy to use, provide a choice of observations, and are good for determining frequency over time. They should be used to collect observable data from a process when the collection is managed by the same person or at the same location.
  • Coded data – Coding is used when too many digits would have to be recorded in small blocks, when long sequences of digits are captured from a single observation, or when rounding errors are observed while recording numbers with many digits. It is also used when numeric data represents attribute data, or when the quantity of data is not enough for statistical significance at the given sample size. Types of coded data collection include:
      • Truncation coding – Storing only the digits that vary, such as 3, 2, or 9 for 1.0003, 1.0002, and 1.0009.
      • Substitution coding – Storing fractional observations as integers, such as recording 32-3/8 inches as 259 when 1/8 inch is the base unit.
      • Category coding – Using a code for a category, such as “S” for scratch.
      • Adding/subtracting a constant or multiplying/dividing by a factor – Usually used for encoding or decoding.
  • Automatic measurements – A computer or other electronic equipment gathers data without human intervention, for example the radioactivity level in a nuclear reactor. The equipment observes and records data for analysis and action. Technological tools for automated data collection include video recording, self-recording test equipment, computers with verification and cross-checking, bar codes, magnetic strips, scanning devices, and radio-frequency identification (RFID).
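The check sheet and coded data methods above can be illustrated with a short sketch. It is only an example under stated assumptions: the observation list, the function names, and every category code other than “S” for scratch are hypothetical. Note that 32-3/8 inches equals 259 eighths of an inch, which is the value the substitution example produces.

```python
from collections import Counter
from fractions import Fraction

# Defect check sheet: tally how often each observed item occurs.
observations = ["scratch", "dent", "scratch", "crack", "scratch", "dent"]
check_sheet = Counter(observations)


def truncation_code(value, base=1.0, scale=10_000):
    """Keep only the digits that vary, e.g. 1.0003 -> 3."""
    return round((value - base) * scale)


def substitution_code(measurement, unit=Fraction(1, 8)):
    """Express a fractional measurement as a whole number of base units."""
    count = Fraction(measurement) / unit
    if count.denominator != 1:
        raise ValueError("measurement is not a whole number of units")
    return int(count)


# Category coding: only "S" for scratch comes from the text; the rest are made up.
CATEGORY_CODES = {"scratch": "S", "dent": "D", "crack": "C"}


def encode_offset(value, offset=0, factor=1):
    """Subtract a constant, then multiply by a factor."""
    return (value - offset) * factor


def decode_offset(code, offset=0, factor=1):
    """Reverse encode_offset: divide by the factor, then add the constant back."""
    return code / factor + offset


print(check_sheet.most_common())
print([truncation_code(v) for v in (1.0003, 1.0002, 1.0009)])  # [3, 2, 9]
print(substitution_code(Fraction(32) + Fraction(3, 8)))         # 259
print([CATEGORY_CODES[item] for item in observations])
```

Using `Fraction` for the substitution example avoids floating-point rounding, so a measurement that is not an exact multiple of the base unit is rejected rather than silently rounded.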

Guidelines before data collection

  • Is there a genuine benefit to collecting this data? – There is always the temptation to collect data just in case you need it later, or because it might have some minor value to the company. When making informed HR data decisions, it’s best to limit data collection to what is truly valuable and necessary for the business to run successfully.
  • Could the intended purpose of the data collection have any negative ramifications for employees? – Consider whether employees would be okay with this information being collected, and whether the data could be used to negatively impact their job or opportunities at work.
  • How could this data be misused? – Many of the problems with collecting employees’ personal information relate to misuse and abuse.
  • Is HR allowed to collect and process this data in the locations where employees work? – You can have the best idea to improve an HR practice, but if the data collection is not allowed where employees are working, it’s a non-starter. If you aren’t sure whether you can collect a certain piece of data, check each country’s data collection guidelines.

For data collection to be effective, the data being collected must be valid, reliable, and free of bias. Only then will the results be useful and hold up to scrutiny during data analysis. Three key terms that refer to accuracy in data collection are reliability, validity, and margin of error.

  • Reliability: Reliability refers to the consistency of the data collection method. The larger the sample is in relation to the population, the more reliable the results are.
  • Validity: Validity refers to the accuracy of the data collection efforts being made. The purpose here is to check that the chosen data collection method truly measures what it seeks to measure; if it does, it can be considered valid.
  • Margin of Error: Margin of error applies to surveys because they are subject to some uncertainty about how well a sample represents a population, and about the validity and reliability of the testing tool. It is therefore important to make every effort to keep the data free of errors, as errors may affect both reliability and validity. The error should not be so large that it prevents you from reaching valid conclusions.
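To make the link between sample size, population size, and margin of error concrete, here is a minimal sketch. It assumes a simple random sample, a yes/no style survey question summarized as a proportion, a 95% confidence level (z = 1.96), and the standard finite population correction; none of these choices come from the text above, and the function name is hypothetical.

```python
import math


def margin_of_error(sample_size, population_size=None, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample. proportion=0.5 is the most
    conservative choice; z=1.96 corresponds to 95% confidence.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    if population_size is not None:
        if sample_size > population_size:
            raise ValueError("sample cannot exceed the population")
        if population_size > 1:
            # Finite population correction: shrinks the error as the
            # sample approaches the size of the whole population.
            standard_error *= math.sqrt(
                (population_size - sample_size) / (population_size - 1)
            )
    return z * standard_error


# Hypothetical survey: 400 responses.
print(round(margin_of_error(400), 3))                        # 0.049
print(round(margin_of_error(400, population_size=1000), 3))  # 0.038
```

With the same 400 responses, the margin of error is smaller when the workforce is 1,000 people than when the population is treated as very large, and it reaches zero when everyone is surveyed.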

There are two main types of error: sampling error and non-sampling error.

  • Sampling Error: Sampling error is statistical in nature. It arises in surveying because only a portion of the population is surveyed rather than the entire population, so the sample may not be fully representative.
  • Non-Sampling Error: Non-sampling errors are not statistical in nature and are caused by human error.