How much does AI consume? What is its ecological footprint? What is its energy impact? Answering these questions is difficult for several reasons. To begin with, the companies that produce the algorithms release no information, even though they are constantly broadcasting their success stories.
Another issue is that we tend to treat AI as a single technology, when it is really many tools, processes, materials, industries, servers and locations. The studies published by tech companies leave out the first stages of the process (extraction of materials and data, training), and the latest data we have from independent researchers predates the pandemic.
One of AI’s promises is that it will solve the great problems of the future, climate change among them. But how much does it consume today? And what if, by the time we reach AGI, there is no planet left? AI promises to help against climate change at some point in the future, but its current development may be doing damage to the planet that its remedy cannot repair.
This week we published a story I had been chasing for a year and a half: we managed to interview Spain’s new Chief Data Officer to find out what he has been working on. Before that I had spoken with people who work with public data, a statistician and a journalist who covered the pandemic, and I had written about how much a Chief Data Officer could help with the management of public data.
His office reports to the Secretaría de Estado de Digitalización e Inteligencia Artificial and is “strategic”, so it does not execute a budget. He says he is constantly asked for data, but he does not manage any: he only helps define policies and build consensus on strategies. One of the main tasks of the Oficina del Dato is to “evangelize” and spread data culture. In a year and a half he has given 3 interviews: 2 to small outlets and one to a public administration magazine.
Something that caught my attention was his view on open data, and I think it matters to everyone who works with public data, such as data journalists. I asked him: could we see ministries publishing data in formats more open than a PDF next year? He said: «Open data is one thing and interoperable data is another», and his strategy is geared more toward the latter. «I think the design of the recovery funds cannot necessarily be tied to open data, but rather to management that has more impact in the short term».
Until recently, when we talked about the digital divide we used metrics such as internet penetration, the number of devices or ADSL data lines among the population, or the number of people who said they had used the internet in the last month. That was left behind a long time ago. Much has changed in the world, and it is not unusual to find fairly high percentages of mobile lines in African countries, where a phone with a connection is a basic necessity for survival.
The digital divide is changing too, and the gap now opens between the different ways each of us uses the network. Even in countries where almost everyone has internet access, one group of the population, older people, is being left out and is demanding to be included.
The first study based on a massive analysis of telecommunications data from an entire developed European country reveals that, with equal access, there are two large groups of online behaviour, and that they are strongly related, above all, to the average income and educational level of the population. When I read the study, I immediately got in touch with two of its researchers, Esteban Moro and Iñaki Úcar, to ask them many questions. "We saw that not everyone was seeing the same internet," Esteban told me.
This entry is part of some findings in the exercises for the MOOC Data visualization for storytelling and discovery.
In the last few years there has been a rise in the spread of viral illnesses that are entirely preventable by vaccines. Measles is one of them, and I downloaded the World Bank dataset for recent years to analyse that information by country and by groups of countries. The latest data is from 2015.
Measles is a highly contagious infectious disease caused by a virus, and it can lead to complications including pneumonia and encephalitis. In 2016, there were 89,780 measles deaths globally – marking the first year measles deaths have fallen below 100,000 per year.
The World Health Organization has recommended that, to achieve herd immunity, more than 95 % of the community must be vaccinated. As a result of widespread vaccination, the disease was declared eliminated from the Americas in 2016. However, it occurred again in this region in 2017 and 2018.
Studies have shown that if an unvaccinated minority (around 5-10%) remains small, herd immunity can still be effective. A problem arises when the minority begins to grow.
The world map shows the countries on a sequential colour scale, where vivid red marks the areas where the percentage of children immunized is under 85 %. The orangish medium tone marks the countries where this ratio sits between 85 and 95 %, which is not enough to prevent the spread of measles. Only the countries with more than 95 % of children vaccinated are safe from measles; they are the lightest hue on the map.
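The three bands of the map can be written down as a small classification rule. This is only a sketch of the thresholds described above, not the code behind the actual map: the function name and the band labels are made up for illustration, and placing exactly 95 % in the middle band is an assumption based on the "more than 95 %" wording.

```python
def coverage_band(pct_immunized: float) -> str:
    """Return the map band for a country's measles immunization percentage."""
    if pct_immunized < 85:
        return "under 85 %"   # vivid red on the map
    if pct_immunized <= 95:
        return "85-95 %"      # medium orange: not enough to stop the spread
    return "over 95 %"        # lightest hue: above the WHO recommendation


# 20 and 92 are figures quoted in this post (South Sudan, the median);
# 99 is the level reported for Tanzania and Cuba.
for pct in (20, 92, 99):
    print(pct, "->", coverage_band(pct))
```

Keeping the thresholds in one function makes it easy to recolour the map if a different cut-off is chosen.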
Most of the countries with the lowest shares of children protected against measles are in Africa, with many in Oceania as well. But even a continent with traditionally good healthcare policies, such as Europe, is not completely safe.
Ukraine, Bosnia, Serbia and Macedonia are under 85 %, and countries such as France, the United Kingdom, Ireland, Italy, the Netherlands, Denmark, Croatia, Slovenia, Switzerland, Finland, Estonia, Latvia, Lithuania, Belarus, Moldova, Romania and Bulgaria stay below 95 % vaccination. In Europe, eighteen countries (Austria, Belgium, Iceland, Luxembourg, the Netherlands, Spain and Sweden among them) reported more cases of measles during the first half of 2017 than during the same period in 2016, according to the European Centre for Disease Prevention and Control.
Rich countries in America, such as the United States and Canada, don’t reach 95 % immunization either.
The distribution shows a median of 92, which falls 3 points short of the WHO recommendation. There’s a definite outlier with only 20 % of children vaccinated, South Sudan. It’s a new country that has suffered ethnic violence and has been in a civil war since 2013, and is acknowledged to have some of the worst health indicators in the world.
Equatorial Guinea is the next outlier, with 27 % vaccination. In spite of being one of sub-Saharan Africa’s largest oil producers, its wealth is distributed extremely unevenly. The country’s authoritarian government is cited as having one of the worst human rights records in the world. Less than half of the population has access to clean drinking water, and 20 % of children die before reaching the age of five.
Countries – Measles & health expenditure per capita
If we consider the variable of health expenditure per capita in USD, we can explore some interesting cases. There’s an outlier in this case too: San Marino. Mean health expenditure across all countries is USD 1,005 per capita. San Marino spends 3,243 USD on public health, a little over 3 times the mean, and still has very low numbers of population immunized against measles. It does not seem to be a problem of money.
The relationship between public expenditure on health and vaccination is interesting because it shows that most of the countries above the immunization levels recommended by the WHO don’t necessarily spend more on public health. Tanzania, with the lowest amount, only 37 USD, reaches 99 % immunization, and there’s a similar pattern in other countries: Russia, Mexico, Turkey, Vietnam, Georgia, Latvia, Poland, El Salvador, Rwanda, Seychelles, Nauru, and others that stay below the mean and still meet the WHO immunization target.
Cuba is perhaps the most cited example of efficiency in public health policy, and it fits here too: with only 817 USD it got 99 % of its population immunized. As I said before, it’s really striking that rich countries with higher GDP and also higher health expenditure per person, such as Canada, the United States, Denmark or France, don’t reach 95 % immunization.
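The "low spend, high coverage" reading of the scatterplot amounts to a simple filter. The sketch below uses only figures quoted in this post for Tanzania and Cuba, plus the mean of USD 1,005. The row labelled "Hypothetical A" is invented for contrast and is not in the World Bank data, and the function name is an assumption.

```python
MEAN_SPEND_USD = 1005   # mean health expenditure per capita quoted above
WHO_TARGET_PCT = 95     # WHO recommendation: more than 95 % vaccinated

# (country, health expenditure per capita in USD, % of children immunized)
countries = [
    ("Tanzania", 37, 99),          # figures from the text
    ("Cuba", 817, 99),             # figures from the text
    ("Hypothetical A", 2500, 90),  # made-up row, for contrast only
]


def low_spend_high_coverage(rows, mean_spend=MEAN_SPEND_USD, target=WHO_TARGET_PCT):
    """Names of countries that spend below the mean yet exceed the target."""
    return [name for name, spend, pct in rows if spend < mean_spend and pct > target]


print(low_spend_high_coverage(countries))
```

With the full dataset loaded in the same shape, the same filter would list every country in the lower-right corner of the scatterplot.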
In a scatterplot that shows groups of countries or continents, there are other observations we can point out or take as clues for further research. The mean for the whole world in these variables is 1,001.66 USD of health expenditure per capita and 84 % of children vaccinated. So we can see that there’s still work to do in this area, because it’s 11 points below what the WHO recommends.
South Asia and Sub-Saharan Africa are the least immunized groups of countries. The groups classed as fragile and conflict-affected situations, low income, heavily indebted poor countries, and least developed countries (as per the UN classification) are those in which we can see a strong correlation with a lower percentage of children vaccinated.
No continent is sufficiently immunized, though Europe and Central Asia come closest to 95 % without reaching it. The OECD members stand at 94.48 %. The countries that reach the WHO’s measles vaccination goal have only one group in common: they are all upper middle income countries.
These explorations are first observations, intended to offer some clues for further research. More variables should be considered in a big study like this, and the particular economic, demographic and social situation of each country should be examined. An interesting variable would be some way of tracking anti-vaccine groups in certain countries or states and their influence in media or social networks. I couldn’t find this kind of data, but I guess that education and information would be interesting variables to take into account here.
This entry is part of some findings in the exercises for the MOOC Data visualization for storytelling and discovery.
Excess body weight is an important risk factor for mortality and morbidity from cardiovascular diseases, diabetes, cancers, and musculoskeletal disorders. It’s the cause of nearly 3 million annual deaths worldwide. Several studies at different levels show that adiposity, as measured by body mass index (BMI, calculated as weight in kg divided by the square of height in m), has increased in recent decades in many populations, although BMI seems to have been stable or even decreased in some groups.
Body mass index is a value derived from the mass (weight) and height of an individual. The BMI is defined as the body mass divided by the square of the body height.
Commonly accepted BMI ranges are: underweight: under 18.5 kg/m2, normal weight: 18.5 to 25, overweight: 25 to 30, obese: over 30. The World Health Organization also adheres to this classification, so those are the lines highlighted on the Y axis of the graphs, to show which and how many countries fall into each range.
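As a quick illustration of the definition and the ranges above, here is a minimal sketch. The function names are made up for this example, and assigning the boundary values 25 and 30 to the higher category is an assumption, since the ranges as listed overlap at their edges.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: mass in kg divided by the square of height in m."""
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value (kg/m2) to the commonly accepted ranges."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal weight"
    if value < 30:
        return "overweight"
    return "obese"


# Illustrative person, not taken from any dataset: 70 kg and 1.75 m tall.
value = bmi(70, 1.75)
print(round(value, 2), bmi_category(value))
```

The same `bmi_category` function can be applied to a country's mean BMI to see which band of the graph it falls into.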
Correlation with income
I used the Gapminder 2012 dataset to explore a bit.
The mean BMI provides a simplified measure of the comparative weight of populations on a country-by-country basis, and my first hunch was to compare the mean BMI of each country with the income per person to see how they correlate. Maps didn’t show the gradients well, as the countries with higher BMI values are few and very small on the map. So I used a scatterplot to see countries, with continents distinguished by colour, and to see the trends.
All the countries with an obese population (Nauru, Tonga, Samoa, Palau, French Polynesia) are Pacific islands, most of them in Polynesia, which raises the question of whether there is an ethnic factor or whether different parameters are needed when studying this area.
Considering the data for women, more countries show a higher BMI, and more fall into the obesity category. Besides those mentioned before, there are Kiribati and the Marshall Islands in the Pacific, Egypt and Kuwait in the Middle East, and Puerto Rico, Saint Kitts and Nevis, and Bermuda in America. This may have some kind of relationship with climate and hot temperatures (?), as many of them lie at fairly low latitudes. A possible clue to keep searching.
We can see that BMI and income don’t show a clear correlation in general, so I thought it would be better to filter and analyse by continent and country in more detail.
There are several studies stating that wealth doesn’t have a direct correlation with BMI as there are more factors involved. «The persistence and emergence of income gradients suggests that disparities in weight status are only partially attributable to poverty and that efforts aimed at reducing disparities need to consider a much broader array of contributing factors», as per Wang and Lauderdale.
A University of North Carolina study employed microdata from China to provide a theoretical examination and an empirical test of the predictions linking household income to adult BMI, using both cross-sectional and panel data analysis. The results show an inverted-U shaped relationship between BMI and family income. Additional income brings about a higher BMI and a higher probability of being overweight or obese for the poor than for the rich.
The median income per person in the Gapminder data for 2012 is only 14,460, and most of the African countries are under that median. But the rest of the countries are quite dispersed, especially in the case of East Asia and Pacific and South Asia.
The discrepancy in Asia has a particular explanation. The WHO has determined that at any given BMI, Asians, including Singaporeans, generally have a higher percentage of body fat than do Caucasians. The BMI cut-off levels for Singaporeans have been revised such that a BMI 23 kg/m2 or higher marks a moderate increase in risk while a BMI 27.5 kg/m2 or more represents high risk for diabetes and cardiovascular diseases.
Besides that, and coming back to all the continents data, a histogram showed that the median for BMI is 25.56, similar to the mean, 25.14.
So in our analysis, most of the countries fall into the overweight or obese classification, and according to several experts that is the biggest nutrition problem we have. More than being underweight, the issue is that we eat bad food and don’t keep a good metabolic balance. Also, if you are poor and lack education, it is harder to get the best nutrients and sustainable food within reach. Education is one of many other variables that can influence the causes of a higher BMI, such as ethnicity, and we cannot establish a serious correlation without digging deeper into other variables.
Correlation with urban population
So I wanted to see how urban population could correlate with BMI. Some studies at the national level find the lifestyle of urban people to be one of the main causes of higher levels of obesity in cities, independently of income. That is the case of a study in Brazil that found that urbanization and the more developed geographic regions were positively associated with the prevalence of overweight/obesity and negatively associated with the prevalence of underweight.
In the grid of scatterplots by continent, we can indeed see a positive correlation for every group. The Asian countries still look very spread out, though. I’d study them separately, after reviewing more papers on their specificities, and wouldn’t include them in a general analysis like this. But for the rest, the correlation is positive.
There are a number of reasons for the association between obesity and economic growth in many economies. Technological changes that lead to lower food prices and increased food consumption are among the factors that link economic growth and obesity, according to a study by Finkelstein and Ruhm. Those factors also come with longer working hours, which lead more people to eat in restaurants and fast food joints.
I find that these kinds of explorations make us pose more and more questions every time, and I could go on and on trying to find papers on each region and on different variables, as I mentioned before, such as education, urban growth (not only total population), differences by latitude, and so on.
Over the last few weeks I’ve been doing a MOOC on Data visualization for storytelling and discovery with Alberto Cairo, which I highly recommend. I’ll post here some of the findings I got from it. The studies are not completely finished, as they would need more work to be presented as a journalistic piece, so they shouldn’t be taken as more than an exercise in the learning process.
1. Dataset BiciMad
First, I wanted to go local, and I live in Madrid. In my city we have a relatively new public bike rental service, and they have their datasets available, so I got a dataset with the data on the new daily users.
In the histogram I can see the concentration and the spread of the data. There’s a curious outlier that corresponds to the maximum value of the dataset, 1446, and there’s another isolated value around 700. I find those two points worth more research. They probably correspond to the day the service started or opened to the public.
The x axis represents the number of new annual-pass users per day. The y axis represents the number of days on which that many users registered. The distribution is skewed to the right because of a few days (2-6) with outlying high numbers of annual passes.
The box plot shows the concentration of what could be a usual number of new users per day. The median is 132 and the mean is 133, so during that year (2014) that is roughly the typical number of new users per day for this service. It could be useful to compare it with datasets from other years and with other kinds of information, to see what variables make people decide to hop on bikes as a means of transport in the city.
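The numbers a box plot summarizes (median, quartiles and the points flagged as outliers) can be computed with the standard library alone. The sketch below is not the analysis behind the charts above. The sample list is a toy series invented for illustration, with 700 and 1446 standing in for the two isolated values mentioned earlier, and the 1.5 × IQR rule is the usual box-plot convention, assumed here.

```python
import statistics


def summarize(daily_new_users):
    """Median, mean, quartiles and box-plot outliers (1.5 * IQR rule)."""
    q1, _, q3 = statistics.quantiles(daily_new_users, n=4)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "median": statistics.median(daily_new_users),
        "mean": statistics.mean(daily_new_users),
        "q1": q1,
        "q3": q3,
        "outliers": [v for v in daily_new_users if v < lower_fence or v > upper_fence],
    }


# Toy data, NOT the BiciMad dataset.
sample = [100, 110, 120, 130, 132, 135, 140, 150, 160, 700, 1446]
print(summarize(sample))
```

Running the same function on each year's series would give comparable summaries for the year-to-year comparison suggested above.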
2. Second case: Comparing education expenditure (%) with GINI Index in the last years in Argentina
I was born in Argentina, where official statistics have not been very good in recent years in terms of transparency, so getting good analysis out of that kind of data is usually extremely complicated.
So I used World Bank data on three variables: total government expenditure on education, private primary school enrollment, and the GINI index. I know GINI is built from several indicators and not only education, but I wanted to give it a try and see how they correlate.
I used data from 1980 to 2015. The highest expenditure on education overall was in 2015, with 5.875 % of GDP. In 1980 there is an outlier point with 2.6 % of GDP spent, before a dark period of 15 years for which there are no records or the data we have falls below 2.6 %.
From 1996 the line rises and shows a positive evolution until the last year in the series (2015), with a hiccup between 2002 and 2005, the years of the default crisis and political instability in Argentina. The overall trend is positive, with a rank correlation of 0.86 (using Spearman’s rank correlation).
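Spearman’s rank correlation is the Pearson correlation computed on the ranks of the values rather than on the values themselves, which is why it suits a trend that rises without being a straight line. The following is a minimal pure-Python sketch of that idea. The helper names are made up, the toy series is invented and is not the World Bank data, and it does not reproduce the 0.86 reported above.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank series."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy series: five consecutive years and an invented expenditure share.
years = [1, 2, 3, 4, 5]
spend = [2.0, 3.0, 2.5, 4.0, 5.0]
print(round(spearman(years, spend), 2))
```

A single dip in an otherwise rising series lowers the coefficient only slightly, which matches the idea of a hiccup inside an overall positive trend.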
The GINI index is the most commonly used measurement of inequality. A Gini coefficient of 1 (or 100%) expresses maximal inequality among values, so if the GINI index goes down, that is better for the country in terms of equality. For OECD countries, in the late 20th century, considering the effect of taxes and transfer payments, the income Gini coefficient ranged between 0.24 and 0.49.
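To make the scale concrete, here is a small sketch of one standard way to compute a Gini coefficient from a list of incomes: the mean absolute difference between all pairs, divided by twice the mean. It is not how the World Bank derives its published index, which comes from survey data and is reported on a 0-100 scale. The function name and the toy income lists are invented for illustration.

```python
def gini(incomes):
    """Gini coefficient as mean absolute difference over twice the mean."""
    n = len(incomes)
    mean = sum(incomes) / n
    total_abs_diff = sum(abs(a - b) for a in incomes for b in incomes)
    return total_abs_diff / (2 * n * n * mean)


print(gini([1, 1, 1, 1]))  # everyone has the same income: perfect equality
print(gini([0, 0, 0, 1]))  # one person holds everything
```

For a finite group of n people this formula tops out at (n - 1) / n rather than exactly 1, so the second example falls a little short of the theoretical maximum; multiplying the result by 100 gives the 0-100 scale.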
When I added the GINI index as the colour of the values, I found an inverse relationship: in recent years, where expenditure on education is higher, the GINI index goes down (which means that Argentina gets closer to equality). There are some quite interesting periods, though, when this relationship does not hold.
One is 1980-1990, when expenditure was lower, well under 2.6 %, and the GINI index stayed below 45. It should be said that some values are missing for those years, and we would need to investigate further to reach any conclusion.
The other is an outlier in 2001, when government expenditure on education is 4.83 % of GDP, the highest in the period until 2009, but the GINI index that year is the highest of all the observations, which is very bad for equality in the country. I find this observation interesting, as 2001 is one of the worst years of the crisis, when Argentina went into financial default.
Many want to see Facebook burn. Cambridge Analytica too, no doubt. There is a long queue of people ready to cast doubt on Donald Trump’s victory, and another in Europe to question Leave’s win in the Brexit referendum. But the big problem revealed this week goes beyond a single company, however large; beyond a single president, even if he leads the most powerful country in the world; and beyond one political process in the European Union. We are having a huge problem with our democracy, privacy and the freedom of our citizens. All of it mixed together.