Find Published Data

With the obligation to share research data, more and more datasets are available online or via dedicated services.

Here are a few tips for finding data. Once you have found a dataset, you still need to check that you are allowed to re-use it.

Using Google's advanced search functions

Google makes it easy to search for a subject followed by the term ‘data’, but did you know that the search operators site: and filetype: allow you to restrict the results to certain source websites and/or certain file types?

For example, if you enter the following in the Google search box:

transports site:admin.ch filetype:xlsx

The results will only contain Excel files relating to transport that are available on Swiss government websites.
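As a minimal sketch, the same query can be assembled programmatically, which is convenient when the search has to be repeated for several sites or file types. The helper names build_google_query and google_search_url are hypothetical and chosen for illustration only; the code merely builds the query string and the corresponding search URL, it does not send any request.

```python
from urllib.parse import urlencode


def build_google_query(terms, site=None, filetype=None):
    """Combine free-text terms with the site: and filetype: operators."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")          # restrict results to one website
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict results to one file type
    return " ".join(parts)


def google_search_url(query):
    """Return a Google search URL; the q parameter carries the search text."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Reproduces the example query from the text above.
query = build_google_query("transports", site="admin.ch", filetype="xlsx")
print(query)
print(google_search_url(query))
```

The printed URL can be pasted into a browser; changing the filetype argument to csv, for instance, would restrict the results to CSV files instead.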

In addition, the advanced search options available by clicking on ‘Tools’ allow you to filter by country, which can be useful when looking for official statistics, for example:

Screenshot of a Google search using the advanced syntax and the Tools filter

Use a dataset search service, such as Google Dataset Search

Google also offers a search interface dedicated specifically to datasets: https://datasetsearch.research.google.com/

The results are presented in such a way that you can see at a glance the information that is essential for re-use, such as the date the data was made available, the licence and the organisation making the data available.

Google Dataset search screenshot

Request data from a data extraction service, such as InfoDesk at HUG

The HUG InfoDesk service can be asked to export data from the HUGData DataLake. This is a treasure trove of clinical and administrative information covering, for example, more than 1.9 million patients, 268 million laboratory analyses and 153 million prescriptions.

Requests for extraction must be made using an eProcess InfoDesk form on the HUG intranet (HUG authentication required). The request follows an institutional approval workflow involving various departments. The review takes into account, in particular, what can be extracted given the structure and scope of the HUGData database, compliance with laws, HUG directives and rules of good practice, the agreement of the research authorities, and patient consent.

The service is free of charge. The time taken to obtain the data varies, depending on the number and nature of the requests received.


Search for a data paper or check the data statements of articles published on the subject you are interested in

If you are more comfortable searching for scientific publications, it is useful to check whether there are publications on your subject, and whether these describe datasets and indicate where they are accessible. In addition to research publications, some articles, known as ‘Data Papers’, deal specifically with datasets that have been made available.

Once a publication has been identified, it is advisable to check in the ‘data availability statement’ section of the article, in its bibliographic references, or even sometimes in the ‘supplementary materials’, whether there is any trace of one or more reusable datasets.

Note that some databases, such as the Archive ouverte UNIGE or Web of Science, offer specific filters to limit the results to ‘data papers’ or ‘data articles’.

Data papers filter in the Archive ouverte UNIGE

Others, like PubMed or the Archive ouverte UNIGE, allow you to search specifically for articles with associated datasets.

Associated Data filter in PubMed
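For readers who prefer to script this kind of search, the sketch below builds a PubMed query URL for the NCBI E-utilities esearch service. It assumes that PubMed's Associated Data filter corresponds to the search tag data[filter]; the function name and the topic "urban transport" are hypothetical examples. The code only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities search service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_associated_data_url(topic, max_results=20):
    """Build an esearch URL for articles on a topic that have associated data.

    Assumption: the tag data[filter] mirrors the Associated Data filter
    offered in the PubMed web interface.
    """
    term = f"({topic}) AND data[filter]"
    params = {
        "db": "pubmed",         # search the PubMed database
        "term": term,           # topic combined with the data filter
        "retmode": "json",      # ask for a JSON response
        "retmax": max_results,  # maximum number of identifiers returned
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


url = pubmed_associated_data_url("urban transport")
print(url)
```

Opening the printed URL returns a list of PubMed identifiers, which can then be looked up in PubMed to read each article's data availability statement.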

Consult data repositories, such as Yareta and others

It is possible to search for datasets directly in data repositories, which are servers dedicated to preserving datasets. At UNIGE, the institutional data repository is called Yareta. It provides a search interface for identifying and accessing datasets (or requesting access to those that are referenced, but whose download is regulated by their depositors).

Depending on the discipline or type of data you are interested in, other data repositories can also be consulted. There are over 3,350 of them, according to the re3data.org directory. The directory can be searched using various criteria, such as discipline. Once you have identified a suitable data repository, you need to search for the datasets directly in that repository, via its website. Unfortunately, the re3data tool does not allow federated searching of the content of the data repositories listed.

Some examples, apart from Yareta:

 

 

Identify data-producing services or institutions and check their websites directly

More and more services and institutions are sharing open data sets. They usually indicate this on their website.

This is the case, for example, with museums, libraries and archive services that offer resources such as Europeana, Gallica, Retronews, etc.

In Switzerland, the https://opendata.swiss/en platform allows you to search for datasets produced by Swiss public administration services such as MétéoSuisse or swisstopo, to name just a few:

Screenshot of the Opendata swiss website 
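Open data portals of this kind can often be queried by a script as well as through the website. The sketch below assumes that opendata.swiss exposes a standard CKAN search API at the address stored in CKAN_SEARCH; that address, the function names and the keyword "swisstopo" are assumptions made for illustration and should be checked against the portal's own documentation. Only the URL is built when the code runs; the download function is defined but not called.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the portal's CKAN search endpoint (verify before use).
CKAN_SEARCH = "https://ckan.opendata.swiss/api/3/action/package_search"


def dataset_search_url(keywords, rows=10, base_url=CKAN_SEARCH):
    """Build a CKAN package_search URL for the given keywords."""
    return base_url + "?" + urlencode({"q": keywords, "rows": rows})


def fetch_dataset_names(url):
    """Download the search results and return the dataset identifiers.

    Requires network access; relies on the usual CKAN response layout,
    where matching datasets are listed under result -> results.
    """
    with urlopen(url, timeout=30) as response:
        payload = json.load(response)
    return [dataset["name"] for dataset in payload["result"]["results"]]


url = dataset_search_url("swisstopo")
print(url)
# To run the search itself: print(fetch_dataset_names(url))
```

Whatever the route used to find a dataset, the licence shown on its page on the portal still needs to be checked before re-use.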

 

 

Last update: July 11, 2025