🔗 CSV to Website Generator
Convert a CSV file into a well-formatted HTML table.
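As a rough illustration of the idea (not the linked generator's actual code), here is a minimal Python sketch that turns CSV text into an HTML table using only the standard library. The function name `csv_to_html_table` and the sample data are made up for this example, and the first row is assumed to be the header.

```python
import csv
import html
import io


def csv_to_html_table(csv_text: str) -> str:
    """Render CSV text as an HTML table; the first row is assumed to be the header."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, body = rows[0], rows[1:]
    lines = ["<table>", "  <thead>", "    <tr>"]
    # Escape every cell so characters like < and & cannot break the markup.
    lines += [f"      <th>{html.escape(cell)}</th>" for cell in header]
    lines += ["    </tr>", "  </thead>", "  <tbody>"]
    for row in body:
        lines.append("    <tr>")
        lines += [f"      <td>{html.escape(cell)}</td>" for cell in row]
        lines.append("    </tr>")
    lines += ["  </tbody>", "</table>"]
    return "\n".join(lines)


# Hypothetical sample input, including a cell that needs escaping.
sample = "name,score\nAda,<10>\n"
page = csv_to_html_table(sample)
print(page)
```

Escaping each cell with `html.escape` is the one design choice worth keeping in any real version, since CSV cells often contain characters that are meaningful in HTML.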
It’s funny how a lot of people viewed me as a rigorous journalist on every other topic. And when it came to this, all of a sudden there was this disbelief in my method of research. There was this suspicion that all of a sudden it wasn’t rigorous. I think that really, really speaks to the very, very, very deeply entrenched biases that exist around this subject.
csvlens is a command line CSV file viewer. It is like less but made for CSV.
Pandas Tutor lets you write Python pandas code in your browser and see how it transforms your data step-by-step.
Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media
The basic principles are: be consistent, write dates like YYYY-MM-DD, do not leave any cells empty, put just one thing in a cell, organize the data as a single rectangle (with subjects as rows and variables as columns, and with a single header row), create a data dictionary, do not include calculations in the raw data files, do not use font color or highlighting as data, choose good names for things, make backups, use data validation to avoid data entry errors, and save the data in plain text files.
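A few of these principles can be checked mechanically. The sketch below is only an illustration, assuming the data is a CSV file with a single header row; the function name `check_table`, the `date_columns` parameter, and the sample data are hypothetical. It flags non-rectangular rows, empty cells, and dates that are not written as YYYY-MM-DD.

```python
import csv
import io
import re

# Matches the YYYY-MM-DD shape only; it does not check that the date is a real one.
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_table(csv_text, date_columns=()):
    """Return a list of human-readable problems found in the CSV text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header = rows[0]
    problems = []
    for line_no, row in enumerate(rows[1:], start=2):
        # The data should form a single rectangle: same cell count as the header.
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} cells, found {len(row)}"
            )
            continue
        for name, cell in zip(header, row):
            value = cell.strip()
            if value == "":
                problems.append(f"line {line_no}: empty cell in '{name}'")
            elif name in date_columns and not ISO_DATE.match(value):
                problems.append(f"line {line_no}: '{name}' is not YYYY-MM-DD: {cell!r}")
    return problems


# Hypothetical sample with a bad date, an empty cell, and a short row.
sample = "id,visit_date,weight\n1,2020-03-01,70.5\n2,03/02/2020,\n3,2020-03-03\n"
problems = check_table(sample, date_columns=("visit_date",))
for problem in problems:
    print(problem)
```

Principles such as choosing good names or keeping a data dictionary still need human judgment; a check like this only covers the rules that are purely structural.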
Interactive charts for several health metrics, including interesting charts on the COVID-19 pandemic.
Three months after the first positive case of coronavirus (COVID-19) appeared in Egypt, work has been done on a number of datasets with the aim of making them available to everyone. Comments, additions, and collaborative work on the project are welcome.
"This guide will help you create links between your academic publications and the underlying datasets, so that anyone viewing the publication will be able to locate the dataset and vice versa. It provides a working knowledge of the issues and challenges involved, and of how current approaches seek to address them. This guide should interest researchers and principal investigators working on data-led research, as well as the data repositories with which they work."
A Global Database of Society
A list of over 1,000 datasets available in R packages, curated by @VincentAB (via @mf_viz) #rstats
jamovi is a new “3rd generation” statistical spreadsheet. Designed from the ground up to be easy to use, jamovi is a compelling alternative to costly statistical products such as SPSS and SAS.
If you’re not planning a long-term career in academia and you only anticipate performing common statistical tests,…
"This course provides an overview of skills needed for reproducible research and open science using the statistical programming language R. Students will learn about data visualisation, data tidying and wrangling, archiving, iteration and functions, probability and data simulations, general linear models, and reproducible workflows. Learning is reinforced through weekly assignments that involve working with different types of data."
Cigarette consumption estimates for 71 countries from 1970 to 2015
How to use the R statistical software to carry out some simple analyses that are common in time series work.
Database query language SQL and how it compares to Excel.
Learn how to find, process, analyze and visualize data
Interesting.
OpenRefine is a powerful free, open source tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
(Originally Google Refine)
The South Asian Name and Group Recognition Algorithm (SANGRA) identifies South Asian individuals in datasets by matching their names to the names in its directory. SANGRA has been validated using health-related electronic data containing names and self-assigned ethnicity, and has been used in a number of other epidemiological studies. Its reported sensitivity is 89–96% and specificity 94–98% for the self-assigned ethnicity census categories 'Asian Bangladeshi', 'Asian Indian' or 'Asian Pakistani'.
The future belongs to the companies who figure out how to collect and use data successfully. In this in-depth piece, O'Reilly editor Mike Loukides examines the unique skills and opportunities that flow from data science.
Interrogation Log of Detainee 063
The best evidence for the validity of the arguments of the three opponents of the President for rejecting the results declared by the Interior Ministry is the data the Ministry itself has issued. In the chart below, compiled based on the data released by the Ministry and announced by Iran’s national television, a perfect linear relation between the votes received by the President and Mir Hossein Mousavi has been maintained, and Mr. Mousavi’s vote is always half of the President’s. The vertical axis (y) shows Mr. Mousavi’s votes, and the horizontal (x) the President’s. R^2 is the coefficient of determination (the square of the correlation coefficient): the closer it is to 1.0, the more perfect is the fit, and it is 0.9995, as close to 1.0 …
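For readers unfamiliar with R^2, the following Python sketch shows how it is computed for a straight-line fit. The numbers are invented purely for illustration and are not the Ministry's figures; the function name `linear_fit_r2` is likewise hypothetical. A series where y is exactly half of x gives a slope of 0.5 and an R^2 of 1.0, while a noisier series gives a lower value.

```python
def linear_fit_r2(xs, ys):
    """Least-squares line y = intercept + slope * x, plus its R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented data: y is exactly half of x at every point.
xs = [2, 4, 6, 8]
exact = linear_fit_r2(xs, [1, 2, 3, 4])

# Invented data with some scatter around the same trend.
noisy = linear_fit_r2(xs, [1, 2.5, 2.5, 4])

print("exact:", exact)
print("noisy:", noisy)
```

For a line fitted by least squares with a single predictor, R^2 equals the square of the correlation coefficient between x and y, which is why the two terms are often used loosely in place of each other.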