Posts with the tag « data » :

🔗 Mona Chalabi on storytelling, the power of data, and covering Palestine


It’s funny how a lot of people viewed me as a rigorous journalist on every other topic. And when it came to this, all of a sudden there was this disbelief in my method of research. There was this suspicion that all of a sudden it wasn’t rigorous. I think that really, really speaks to the very, very, very deeply entrenched biases that exist around this subject.

🔗 Data Organisation in Spreadsheets


The basic principles are: be consistent, write dates like YYYY-MM-DD, do not leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, do not include calculations in the raw data files, do not use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
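
A few of these principles can be checked mechanically. The sketch below is only an illustration, not something from the paper: the function name `check_table`, the column names, and the sample rows are all made up, and it covers just three rules (a single rectangle, no empty cells, dates written as YYYY-MM-DD).

```python
import csv
import io
import re

# Matches the YYYY-MM-DD shape only; it does not check that the date exists.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_table(text, date_columns=()):
    """Return a list of problems found in CSV text with a single header row."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    problems = []
    for line_no, row in enumerate(body, start=2):
        # Rule: the data should form a single rectangle.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} cells, got {len(row)}"
            )
            continue
        for name, cell in zip(header, row):
            # Rule: do not leave any cells empty.
            if cell.strip() == "":
                problems.append(f"line {line_no}: empty cell in '{name}'")
            # Rule: write dates like YYYY-MM-DD.
            elif name in date_columns and not ISO_DATE.match(cell):
                problems.append(
                    f"line {line_no}: '{cell}' in '{name}' is not YYYY-MM-DD"
                )
    return problems


# Hypothetical sample: the second data row has a badly formatted date
# and a missing weight.
sample = "id,visit_date,weight_kg\n1,2020-03-14,71.5\n2,14/03/2020,\n"
for problem in check_table(sample, date_columns={"visit_date"}):
    print(problem)
```

Spreadsheet programs offer data validation that catches the same mistakes at entry time; a script like this is more useful as a check on the exported plain text file.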

🔗 GitHub - maradam4/COVID19-Egypt-dataset


Three months after the first positive case of coronavirus (COVID-19) appeared in Egypt, work has been done on a number of datasets, with the aim of making them available to everyone. We welcome comments, additions, and collaborative work on the project.

🔗 How to Cite Datasets and Link to Publications | Digital Curation Centre


"This guide will help you create links between your academic publications and the underlying datasets, so that anyone viewing the publication will be able to locate the dataset and vice versa. It provides a working knowledge of the issues and challenges involved, and of how current approaches seek to address them. This guide should interest researchers and principal investigators working on data-led research, as well as the data repositories with which they work."

🔗 Data Skills for Reproducible Science


"This course provides an overview of skills needed for reproducible research and open science using the statistical programming language R. Students will learn about data visualisation, data tidying and wrangling, archiving, iteration and functions, probability and data simulations, general linear models, and reproducible workflows. Learning is reinforced through weekly assignments that involve working with different types of data."

🔗 OpenRefine


OpenRefine is a powerful free, open source tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.

(Originally Google Refine)

🔗 Suicide rates in people of South Asian origin in England and Wales: 1993-2003 -- McKenzie et al. 193 (5): 406 -- The British Journal of Psychiatry

The South Asian Name and Group Recognition Algorithm (SANGRA) identifies South Asian individuals in data-sets by matching their names to the names in its directory. SANGRA has been validated using health-related electronic data containing names and self-assigned ethnicity, and has been used in a number of other epidemiological studies. Its reported sensitivity is 89–96% and specificity 94–98% for self-assigned ethnicity census categories 'Asian Bangladeshi', 'Asian Indian' or 'Asian Pakistani'.
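
For readers unfamiliar with the two measures: sensitivity is the share of people who self-identify in one of those categories that the name matching correctly flags, and specificity is the share of everyone else that it correctly leaves unflagged. A minimal sketch follows; the counts are invented for illustration and are not taken from the study.

```python
def sensitivity(true_pos, false_neg):
    """Share of actual positives that the classifier flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of actual negatives that the classifier leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical validation counts, NOT figures from the paper.
true_pos, false_neg = 90, 10    # self-identified South Asian: flagged / missed
true_neg, false_pos = 950, 50   # everyone else: not flagged / wrongly flagged

print(f"sensitivity = {sensitivity(true_pos, false_neg):.0%}")
print(f"specificity = {specificity(true_neg, false_pos):.0%}")
```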

🔗 Faulty Election Data – tehranbureau


The best evidence for the validity of the arguments of the three opponents of the President for rejecting the results declared by the Interior Ministry is the data the Ministry itself has issued. In the chart below, compiled based on the data released by the Ministry and announced by Iran’s national television, a perfect linear relation between the votes received by the President and Mir Hossein Mousavi has been maintained, and Mr. Mousavi’s vote is always half of the President’s. The vertical axis (y) shows Mr. Mousavi’s votes, and the horizontal (x) the President’s. R^2 shows the correlation coefficient: the closer it is to 1.0, the more perfect is the fit, and it is 0.9995, as close to 1.0 …
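
As a rough illustration of what that figure measures: for a straight-line fit of y on x, R^2 is the square of the Pearson correlation coefficient r. The sketch below computes it from scratch. The function name `r_squared` and both sets of numbers are made up for the example and have nothing to do with the Ministry's data.

```python
def r_squared(xs, ys):
    """Square of the Pearson correlation between xs and ys.

    For a simple linear fit of y on x this equals the R^2 of the fit.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / (sxx * syy) ** 0.5
    return r * r


# Hypothetical running totals where y is always exactly half of x:
# the points sit on one line, so R^2 comes out at (essentially) 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.5, 1.0, 1.5, 2.0, 2.5]
print(round(r_squared(xs, ys), 4))

# Hypothetical noisier series: the points scatter around a line,
# so R^2 is clearly lower.
noisy = [1.0, 3.0, 2.0, 5.0, 4.0]
print(round(r_squared(xs, noisy), 4))
```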