Content
Researchers can collect their own data, but they do not have to if they can find data owned or created by someone else that fits their needs. This is called secondary data, in contrast to primary data, which is data you collect yourself. Thanks to open science and requirements from federal funding agencies that federally funded research data be made public, there is more publicly available research data than ever before. The trick is finding it, but the effort typically pays off in some combination of lower costs, faster access to data, and access to populations or historical data you could not collect yourself.
There are many types of secondary data, and access requirements vary. Secondary data includes data collected specifically for research, data collected for surveillance systems (such as public health outcomes), and administrative data - data collected for the purposes of organizational operations. Access to secondary data typically falls into three categories: open access, public access, and restricted.
- Open data is data you can download freely. Common sources are data aggregators (sites with lists of links to datasets), government data portals, and data repositories that allow free downloads or only require an email address. (Be cautious downloading data from unknown sources!)
- Public data is data that is findable, but requires some additional effort to access it. The effort required will vary by source, but typically involves one or more of the following: creating an account, signing a data agreement, having a membership, or submitting a request. Many data repositories provide access to public data, requiring users to sign off on acceptable ways to use the downloaded data.
- Restricted data is data that has specific rules governing access and, often, how the data may be used. In some cases, a researcher may have to go to the physical location where the data is hosted, or visit a restricted data center (like the Central Plains Federal Statistical Research Data Center).
Some datasets may also have variable access, where different variables or versions of the data are available under different access types. For example, the National Longitudinal Study of Adolescent to Adult Health (or Add Health) has both public and restricted versions.
Remember!
Websites can get taken down and links can change, so make sure you download all the documentation when you download your data.
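One way to act on this reminder is to save a small record next to each downloaded file noting where it came from, when it was retrieved, and which documentation files came with it. The sketch below is only an illustration, not a required workflow: the function names, the `.provenance.json` naming, and the fields recorded are all assumptions made for this example, and it uses only the Python standard library.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(data_file, source_url, documentation_files, access_type):
    """Write a JSON file beside the data describing where it came from.

    All field names here are hypothetical; adapt them to your own project.
    """
    data_path = Path(data_file)
    record = {
        "data_file": data_path.name,
        "source_url": source_url,
        # e.g. "open", "public", or "restricted"
        "access_type": access_type,
        "retrieved_utc": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        # The checksum lets you confirm later that the file has not changed.
        "sha256": sha256_of(data_path),
        "documentation_files": [Path(p).name for p in documentation_files],
    }
    sidecar = data_path.with_name(data_path.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling `write_provenance("survey.csv", "https://example.org/survey", ["codebook.pdf"], "open")` (a placeholder file and URL) would create `survey.csv.provenance.json` in the same folder. Keeping this record with the data means the source and documentation can still be identified if the original website changes.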
Finding Secondary Data
There are many ways to find secondary data. These are some of the most common ways:
- Searches - your favorite search engine can be a great starting point. Try searching for "open sources of data" and you should find more results than one person wants to go through, many of which are likely to be data repositories and aggregators. There are also dedicated search sites for data, like Data.gov for open data from the US government and the Registry of Research Data Repositories (or re3data.org), which is a search engine for data repositories.
- Data repositories and aggregators - collections of datasets that may or may not be organized around a single field of study. Datasets found in aggregators are usually in different formats and have less information available to help you search. Examples of these include FAIRsharing.org and the Data Rescue Project. Repositories typically have formatting requirements that data must meet before it can be deposited, which helps with searching. There are generalist repositories like Dryad, Zenodo, and Open Science Framework, but many repositories are discipline specific, like the Inter-university Consortium for Political and Social Research (or ICPSR), which archives social science data. Others are very specific, like OpenPain, which is focused on brain imaging studies of human pain.
- Data collectors - governmental agencies, businesses, and universities often collect their own data and provide access to it. Government agency data may be accessed through a portal, such as the US Census, via dashboards like Nebraska's Department of Health and Human Services, or on a website like the US Department of Agriculture National Agricultural Statistics Service. Universities always have institutional data about their campus, employees, and students, but access to it will depend on the school. But many universities also have data collection services, like the Bureau of Sociological Research at UNL, which collects data, but also has data that can be requested by researchers.
- Citations - if you are reading literature related to your research, the authors are likely describing their data sources, and those datasets may be accessible to you. If a specific dataset doesn't sound quite right, you can use resources like Connected Papers, Google Scholar (related articles), or Web of Science to find related research. And the Libraries has over 500 databases you can use.
You may also become aware of datasets through your networks and the organizations you are part of. (This data may not be organized in the same way as data prepared for sharing, so keep that in mind.)
Public Data About Nebraska
The following resources are for people looking for data about Nebraska:
- The All Things Nebraska portal provides access to maps, reports, and tools about Nebraskan communities.
- The Nebraska Rural Poll is the longest running poll of rural life in the country. The Rural Poll offers regular webinars sharing information from their findings, and users can also build their own interactive reports. Researchers can partner with the Rural Poll to add questions or oversample desired areas.
- The Nebraska Annual Social Indicators Survey (NASIS) is an omnibus survey of Nebraskans ages 19+ that has been collected since 1977 by the Bureau of Sociological Research (BOSR). BOSR also conducts the NebrASKa Voices panel for those wanting to collect data from Nebraskans. Researchers can purchase questions on the NASIS, and can also request past datasets.
- For people looking for physical, social, and emotional health data, as well as community well-being data, at the sub-state level for Nebraska (e.g. region, county, or ZIP code), the following files list known datasets:
  - The Nebraska Data Sources file is a .pdf of various data sources that describes the characteristics of each dataset, the website where the data can be found, and how to access the data.
  - The Nebraska Data Sources by Topic is an Excel file that lists the data sources from the .pdf by topics from the Healthy People 2030 objectives.
- SANDY, the UNL data repository, has datasets about Nebraska.