📺 How to add a dataset to the Gateway

Metadata onboarding tutorial series:


You can now onboard and edit your metadata directly on the Innovation Gateway. The Gateway does not hold a copy of Data Custodian data. Instead, it stores summary information used to describe each of the datasets that the Data Custodian holds.

Features

  • Where to add a dataset
  • Adding information about your dataset
  • Uploading files relating to your dataset

Transcript:

You can now onboard and edit your metadata, your datasets, directly on the Innovation Gateway. The Gateway does not hold a copy of Data Custodian data. Instead, it stores summary information used to describe each of the datasets that the Data Custodian holds. Let’s look at how we can add a dataset.

Navigate to your team dashboard by going to your individual account and selecting the drop-down beside your name. You will then be able to select and switch to a team’s view. Visit the link in the transcript to learn more about teams: Creating and managing a team.

Select the ‘dataset’ tab and then select the ‘add a new dataset’ button.

The form to add your dataset is split into 8 sections, which you can see here on the left hand side. Some sections have sub-sections, which you can identify by the titles with an arrow beside them.

The page you will first land on is the ‘before you begin’ section. This will walk you through the process, giving you the information you need to understand how the approval process works, how to ask HDRUK further questions, best practice and further guidance, the rules on uploading multiple datasets, and the metadata score.

To read these sections, select the section you require and it will open to reveal the information - like so. You can click to open the other sections and select again to close them.

The action bar at the bottom of the page will be visible throughout the whole process. On the far left you can see a grey draft status. Much like the dataset card on the dataset dashboard, this indicates that the form has not been submitted for review and is still being worked on.

You can then see the number of questions you have answered, alongside the total number of questions to answer. On the right hand side there are three buttons: archive, which allows you to archive this dataset; submit for review, which will send your application to the HDRUK admins; and next, which will move you to the next section of the form.

Once you’ve understood the process, you can begin completing this form.

Select next, or you can select the section you want to go to by clicking the relevant title on the left hand side.

As you can see, the Provenance section has two sub-sections - origin and temporal. I can switch between these two sub-sections the same way I did before, by selecting next or by selecting the sub-title I want to go to.

Here I can see the title, and explanation of what this section is for. Every section will have an explanation, allowing users to understand why this information is required and important.

On the rest of the page you can see the input fields and guidance.

Much like the Data Access Request form, every input has attached guidance - allowing users to get the help they need if required. View guidance by selecting the ‘?’ button beside the input. The guidance will then appear on the right hand side.

There are a variety of different inputs across the form. On origin we can see selection inputs where you choose options from a pre-existing list, and on temporal we can see drop-down selections and date selections. As you can see, there are asterisks beside the accrual periodicity, start date and time lag inputs. This means that these inputs are mandatory: you must complete all mandatory fields before you are allowed to submit your application for review.
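
As a rough illustration of the mandatory-field rule, here is a minimal sketch in Python. It is not the Gateway's own code: the key names and the function `missing_mandatory` are hypothetical, chosen only to mirror the three asterisked inputs mentioned above.

```python
# Hypothetical keys standing in for the three asterisked temporal inputs.
MANDATORY_TEMPORAL = ("accrual_periodicity", "start_date", "time_lag")


def missing_mandatory(answers, mandatory=MANDATORY_TEMPORAL):
    """Return the mandatory inputs that are still unanswered.

    Assumption: an input counts as unanswered when it is absent,
    None, or an empty string.
    """
    return [name for name in mandatory if answers.get(name) in (None, "")]


# Example: only one of the three mandatory inputs has been filled in.
draft = {"accrual_periodicity": "Monthly", "start_date": ""}
print(missing_mandatory(draft))
```

Under these assumptions, a form would only be ready to submit for review once this list is empty.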

I’ll now fill in the inputs on the origin sub-section. As you can see, the circle beside the Origin section is now filled in with a tick. This indicates I have completed this section. This completion will be reflected on the card on the dataset dashboard.

Any sections that you begin but don’t finish will have a half circle beside them. Any sections with an empty circle have no inputs completed.
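
The three indicators (tick, half circle, empty circle) can be thought of as a simple classification of each section. The sketch below is only an illustration under stated assumptions: the names `SectionStatus` and `section_status` are hypothetical, and it assumes a section counts as complete when every one of its inputs has a value, which the tutorial does not confirm.

```python
from enum import Enum


class SectionStatus(Enum):
    EMPTY = "empty circle"    # nothing filled in
    PARTIAL = "half circle"   # started but not finished
    COMPLETE = "tick"         # every input answered (assumption)


def section_status(answers):
    """Classify a section from a dict of input name -> value.

    Assumption: None or an empty string means the input is unanswered.
    """
    answered = [v for v in answers.values() if v not in (None, "")]
    if not answered:
        return SectionStatus.EMPTY
    if len(answered) == len(answers):
        return SectionStatus.COMPLETE
    return SectionStatus.PARTIAL


# Example: one of two inputs answered, so the section is only started.
print(section_status({"purpose": "Study", "source": ""}).value)
```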

Let’s have a look at other inputs. On Documentation there are free-text inputs to provide detailed information. Larger input boxes are available for longer text.

You are able to add rows of information on some inputs, such as associated media, by using the + and - buttons.

On the Observation section, you can add multiple sections using the ‘add another section’ button, where you can add your information to these 5 input fields again.

Lastly, the structural metadata section is for adding a data dictionary. The data dictionary should be submitted using the data dictionary template provided in this form. To download it, click ‘download data dictionary template’. It will provide you with a template that lists out the following fields. Guidance is available here to help you fill in the template.

Once you’ve completed the data dictionary spreadsheet, you can attach it using the ‘select file’ button under Upload. Only Excel or .xls files are accepted, and there is a file size restriction of 10MB.
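
If you want to check a file against these two restrictions before uploading, something like the following sketch could help. It is not how the Gateway itself checks files: the function `check_upload` is hypothetical, it assumes that "Excel" covers both the .xls and .xlsx extensions, and it assumes the limit means 10 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Assumptions: both Excel extensions are allowed, and the limit is in mebibytes.
ALLOWED_EXTENSIONS = {".xls", ".xlsx"}
MAX_SIZE_BYTES = 10 * 1024 * 1024


def check_upload(filename, size_bytes):
    """Return a list of problems with a candidate file; empty means it looks fine."""
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("file type is not Excel (.xls or .xlsx)")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than the 10MB limit")
    return problems


# Example: a small spreadsheet passes, a large CSV fails both checks.
print(check_upload("data_dictionary.xlsx", 250_000))
print(check_upload("data_dictionary.csv", 11 * 1024 * 1024))
```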

Once you’ve selected the file you can see the file name and size under upload here. You can remove it and select another one if it is the incorrect file.

When you’re happy with your file selection, click ‘upload this file.’

Once the file is uploaded you will either see a successful upload, or the system will identify any errors.

The entire spreadsheet will be available for you to view.

If there are errors you will receive a message telling you that errors have been found and that you need to upload the file again with these errors corrected. At the top it will indicate in exactly which row and column the error is. The errors are highlighted in red. Here we are told that in row 3, in the sensitive column, FAL is entered, and it should be true or false. In row 3, in the sensitive column, there is an empty field, where it should be true or false. It is helpful to look at the guidance at the top of the page to ensure that there are no errors.
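
To catch this kind of error before uploading, you could run a quick check over the sensitive column yourself. The sketch below is only an illustration and not the Gateway's validation: `find_sensitive_errors` is a hypothetical helper, the rows are plain Python dicts rather than a real spreadsheet, and it assumes that row 1 is the header row and that true or false is accepted in any letter case.

```python
def find_sensitive_errors(rows, column="sensitive", first_row=2):
    """Return (row number, column, offending value) for each invalid cell.

    Assumptions: row 1 of the spreadsheet is the header, so the first
    data row is row 2; valid values are 'true' or 'false' in any case.
    """
    errors = []
    for offset, row in enumerate(rows):
        raw = row.get(column)
        value = "" if raw is None else str(raw).strip()
        if value.lower() not in ("true", "false"):
            errors.append((first_row + offset, column, value or "(empty)"))
    return errors


# Example: a mistyped value and an empty cell are both reported.
rows = [{"sensitive": "TRUE"}, {"sensitive": "FAL"}, {"sensitive": ""}]
for row_number, column, value in find_sensitive_errors(rows):
    print(f"Row {row_number}, column '{column}': found {value}, expected true or false")
```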

When you have corrected your errors, click ‘select file’ again - here you’ll get a warning that it will overwrite your current upload. Select your updated file and upload again. Now that there are no errors, your file has no warning messages, and you will receive a message that your data has been successfully uploaded.

The next tutorial in this series will walk you through how to submit your dataset for review.