Nanotechnology ("nanotech") is manipulation of matter on an atomic, molecular, and supramolecular scale. The earliest, widespread description of nanotechnology referred to the particular technological goal of precisely manipulating atoms and molecules for fabrication of macroscale products, also now referred to as molecular nanotechnology. A more generalized description of nanotechnology was subsequently established by the National Nanotechnology Initiative, which defines nanotechnology as the manipulation of matter with at least one dimension sized from 1 to 100 nanometers. This definition reflects the fact that quantum mechanical effects are important at this quantum-realm scale, and so the definition shifted from a particular technological goal to a research category inclusive of all types of research and technologies that deal with the special properties of matter which occur below the given size threshold. It is therefore common to see the plural form "nanotechnologies" as well as "nanoscale technologies" to refer to the broad range of research and applications whose common trait is size.
Nanotechnology as defined by size is naturally very broad, including fields of science as diverse as surface science, organic chemistry, molecular biology, semiconductor physics, energy storage, microfabrication, molecular engineering, etc. The associated research and applications are equally diverse, ranging from extensions of conventional device physics to completely new approaches based upon molecular self-assembly, from developing new materials with dimensions on the nanoscale to direct control of matter on the atomic scale.
Scientists currently debate the future implications of nanotechnology. Nanotechnology may be able to create many new materials and devices with a vast range of applications, such as in nanomedicine, nanoelectronics, biomaterials, energy production, and consumer products. On the other hand, nanotechnology raises many of the same issues as any new technology, including concerns about the toxicity and environmental impact of nanomaterials, and their potential effects on global economics, as well as speculation about various doomsday scenarios. These concerns have led to a debate among advocacy groups and governments on whether special regulation of nanotechnology is warranted.
Robotics is an interdisciplinary branch of engineering and science that includes mechanical engineering, electronic engineering, information engineering, computer science, and others. Robotics deals with the design, construction, operation, and use of robots, as well as computer systems for their control, sensory feedback, and information processing.
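To make the control and sensory-feedback aspect concrete, here is a minimal, hypothetical sketch in Python of a sense-compute-act loop: a simulated one-dimensional cart is driven toward a target position by a proportional controller. The simulated dynamics, gain, and step count are illustrative assumptions, not a description of any particular robot.

```python
# Minimal sketch of a robot control loop: sense, compute a command, act.
# The "robot" is a simulated 1-D cart driven toward a target position by a
# proportional controller; all numbers are illustrative assumptions.

def simulate(target=1.0, gain=0.5, steps=20, dt=0.1):
    position = 0.0
    for step in range(steps):
        error = target - position          # sensory feedback: distance to the goal
        command = gain * error             # control law: proportional to the error
        velocity = command                 # actuation: command sets the velocity
        position += velocity * dt          # the simulated world updates
        print(f"step {step:2d}  position {position:.3f}  error {error:.3f}")

if __name__ == "__main__":
    simulate()
```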
These technologies are used to develop machines that can substitute for humans and replicate human actions. Robots can be used in many situations and for many purposes, but today many are used in dangerous environments (including bomb detection and deactivation), in manufacturing processes, or where humans cannot survive (e.g. in space, underwater, in high heat, and in the cleanup and containment of hazardous materials and radiation). Robots can take on any form, but some are made to resemble humans in appearance. This is said to help in the acceptance of a robot in certain replicative behaviors usually performed by people. Such robots attempt to replicate walking, lifting, speech, cognition, or any other human activity. Many of today's robots are inspired by nature, contributing to the field of bio-inspired robotics.
The concept of creating machines that can operate autonomously dates back to classical times, but research into the functionality and potential uses of robots did not grow substantially until the 20th century. Throughout history, it has been frequently assumed by various scholars, inventors, engineers, and technicians that robots will one day be able to mimic human behavior and manage tasks in a human-like fashion. Today, robotics is a rapidly growing field, as technological advances continue; researching, designing, and building new robots serve various practical purposes, whether domestically, commercially, or militarily. Many robots are built to do jobs that are hazardous to people, such as defusing bombs, finding survivors in unstable ruins, and exploring mines and shipwrecks. Robotics is also used in STEM (science, technology, engineering, and mathematics) as a teaching aid. The advent of nanorobots, microscopic robots that can be injected into the human body, could revolutionize medicine and human health.
Robotics is a branch of engineering that involves the conception, design, manufacture, and operation of robots. This field overlaps with electronics, computer science, artificial intelligence, mechatronics, nanotechnology and bioengineering.
Statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments. See glossary of probability and statistics.
When census data cannot be collected, statisticians collect data by developing specific experiment designs and survey samples. Representative sampling assures that inferences and conclusions can reasonably extend from the sample to the population as a whole. An experimental study involves taking measurements of the system under study, manipulating the system, and then taking additional measurements using the same procedure to determine if the manipulation has modified the values of the measurements. In contrast, an observational study does not involve experimental manipulation.
Two main statistical methods are used in data analysis: descriptive statistics, which summarize data from a sample using indexes such as the mean or standard deviation, and inferential statistics, which draw conclusions from data that are subject to random variation (e.g., observational errors, sampling variation). Descriptive statistics are most often concerned with two sets of properties of a distribution (sample or population): central tendency (or location) seeks to characterize the distribution's central or typical value, while dispersion (or variability) characterizes the extent to which members of the distribution depart from its center and each other. Inferences on mathematical statistics are made under the framework of probability theory, which deals with the analysis of random phenomena.
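A short sketch may help separate the two kinds of method. The sample below is simulated, and the specific summary statistics and confidence interval are illustrative choices, not prescribed by the text.

```python
# Descriptive statistics (summarizing a sample) vs. inferential statistics
# (drawing conclusions under random variation). Data are simulated for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=170, scale=8, size=50)   # hypothetical heights, in cm

# Descriptive: central tendency and dispersion of the sample itself.
print("mean:", np.mean(sample))
print("median:", np.median(sample))
print("standard deviation:", np.std(sample, ddof=1))

# Inferential: a 95% confidence interval for the population mean,
# acknowledging sampling variation.
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=np.mean(sample), scale=stats.sem(sample))
print("95% confidence interval for the mean:", ci)
```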
A standard statistical procedure involves the test of the relationship between two statistical data sets, or a data set and synthetic data drawn from an idealized model. A hypothesis is proposed for the statistical relationship between the two data sets, and this is compared as an alternative to an idealized null hypothesis of no relationship between two data sets. Rejecting or disproving the null hypothesis is done using statistical tests that quantify the sense in which the null can be proven false, given the data that are used in the test. Working from a null hypothesis, two basic forms of error are recognized: Type I errors (null hypothesis is falsely rejected, giving a "false positive") and Type II errors (null hypothesis fails to be rejected and an actual relationship between populations is missed, giving a "false negative"). Multiple problems have come to be associated with this framework, ranging from obtaining a sufficient sample size to specifying an adequate null hypothesis.
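As a hedged illustration of this procedure, the sketch below runs a two-sample t-test between two simulated groups; the group sizes, effect size, and significance level are assumptions made only for the example.

```python
# Null-hypothesis test between two data sets, with the two error types in mind:
# rejecting a true null is a Type I error ("false positive"); failing to reject
# a false null is a Type II error ("false negative"). Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
group_a = rng.normal(loc=10.0, scale=2.0, size=40)
group_b = rng.normal(loc=10.8, scale=2.0, size=40)   # true difference of 0.8

t_stat, p_value = stats.ttest_ind(group_a, group_b)
alpha = 0.05                                          # tolerated Type I error rate
if p_value < alpha:
    print(f"p = {p_value:.4f}: reject the null hypothesis of no difference")
else:
    print(f"p = {p_value:.4f}: fail to reject the null (possible Type II error)")
```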
Measurement processes that generate statistical data are also subject to error. Many of these errors are classified as random (noise) or systematic (bias), but other types of errors (e.g., blunder, such as when an analyst reports incorrect units) can also occur. The presence of missing data or censoring may result in biased estimates and specific techniques have been developed to address these problems.
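The distinction between random and systematic error can be shown with a small simulation; the true value, noise level, and bias below are arbitrary illustrative numbers.

```python
# Simulating the two main classes of measurement error mentioned above:
# random error (noise) scatters readings around the true value, while
# systematic error (bias) shifts them all in the same direction.
import numpy as np

rng = np.random.default_rng(2)
true_value = 100.0
noise_only = true_value + rng.normal(0.0, 1.0, size=1000)       # random error only
biased = true_value + 2.5 + rng.normal(0.0, 1.0, size=1000)     # bias plus noise

print("noise only : mean %.2f (close to truth), spread %.2f"
      % (noise_only.mean(), noise_only.std()))
print("with bias  : mean %.2f (offset from truth), spread %.2f"
      % (biased.mean(), biased.std()))
```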
The earliest writings on probability and statistics, statistical methods drawing from probability theory, date back to Arab mathematicians and cryptographers, notably Al-Khalil (717–786) and Al-Kindi (801–873). In the 18th century, statistics also started to draw heavily from calculus. In more recent years, statistics has relied more heavily on statistical software to carry out tests and produce analyses such as descriptive statistics.
Data mining is the process of discovering patterns in large data sets involving methods at the intersection of machine learning, statistics, and database systems. Data mining is an interdisciplinary subfield of computer science and statistics with an overall goal to extract information (with intelligent methods) from a data set and transform the information into a comprehensible structure for further use. Data mining is the analysis step of the "knowledge discovery in databases" process or KDD. Aside from the raw analysis step, it also involves database and data management aspects, data pre-processing, model and inference considerations, interestingness metrics, complexity considerations, post-processing of discovered structures, visualization, and online updating.
The term "data mining" is a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself. It also is a buzzword and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support system, including artificial intelligence (e.g., machine learning) and business intelligence. The book Data mining: Practical machine learning tools and techniques with Java (which covers mostly machine learning material) was originally to be named just Practical machine learning, and the term data mining was only added for marketing reasons. Often the more general terms (large scale) data analysis and analytics – or, when referring to actual methods, artificial intelligence and machine learning – are more appropriate.
The actual data mining task is the semi-automatic or automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, but do belong to the overall KDD process as additional steps.
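As a rough illustration of two of these tasks, the sketch below clusters synthetic records and flags unusual ones; the data, the library (scikit-learn), and the parameter choices are assumptions made for the example, not part of any standard KDD pipeline.

```python
# Two core data-mining tasks from the paragraph above: cluster analysis
# (finding groups of records) and anomaly detection (finding unusual records).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(3)
cluster_1 = rng.normal([0, 0], 0.5, size=(100, 2))
cluster_2 = rng.normal([5, 5], 0.5, size=(100, 2))
outliers = np.array([[10.0, -3.0], [-4.0, 9.0]])      # a few unusual records
data = np.vstack([cluster_1, cluster_2, outliers])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(data)
anomaly = IsolationForest(random_state=0).fit_predict(data)   # -1 marks anomalies

print("cluster sizes:", np.bincount(labels))
print("records flagged as anomalous:", int((anomaly == -1).sum()))
```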
The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.
The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.
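A small simulation makes the danger concrete: even when the null hypothesis is true for every comparison, testing enough hypotheses produces some apparently significant results. The number of tests and the significance threshold below are illustrative assumptions.

```python
# Illustration of the data-dredging problem: testing many hypotheses on pure
# noise still produces some "significant" p-values by chance, which is why
# patterns found this way need confirmation on independent data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
false_positives = 0
n_tests = 200
for _ in range(n_tests):
    a = rng.normal(size=30)      # both groups drawn from the same distribution,
    b = rng.normal(size=30)      # so the null hypothesis is true by construction
    _, p = stats.ttest_ind(a, b)
    if p < 0.05:
        false_positives += 1

print(f"{false_positives} of {n_tests} tests were 'significant' at the 0.05 level")
```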
The term "data mining" is a misnomer, because the goal is the extraction of patterns and knowledge from large amounts of data, not the extraction (mining) of data itself. It also is a buzzword and is frequently applied to any form of large-scale data or information processing (collection, extraction, warehousing, analysis, and statistics) as well as any application of computer decision support system, including artificial intelligence (e.g., machine learning) and business intelligence. The book Data mining: Practical machine learning tools and techniques with Java (which covers mostly machine learning material) was originally to be named just Practical machine learning, and the term data mining was only added for marketing reasons. Often the more general terms (large scale) data analysis and analytics – or, when referring to actual methods, artificial intelligence and machine learning – are more appropriate.
The actual data mining task is the semi-automatic or automatic analysis of large quantities of data to extract previously unknown, interesting patterns such as groups of data records (cluster analysis), unusual records (anomaly detection), and dependencies (association rule mining, sequential pattern mining). This usually involves using database techniques such as spatial indices. These patterns can then be seen as a kind of summary of the input data, and may be used in further analysis or, for example, in machine learning and predictive analytics. For example, the data mining step might identify multiple groups in the data, which can then be used to obtain more accurate prediction results by a decision support system. Neither the data collection, data preparation, nor result interpretation and reporting is part of the data mining step, but do belong to the overall KDD process as additional steps.
The difference between data analysis and data mining is that data analysis is used to test models and hypotheses on the dataset, e.g., analyzing the effectiveness of a marketing campaign, regardless of the amount of data; in contrast, data mining uses machine learning and statistical models to uncover clandestine or hidden patterns in a large volume of data.
The related terms data dredging, data fishing, and data snooping refer to the use of data mining methods to sample parts of a larger population data set that are (or may be) too small for reliable statistical inferences to be made about the validity of any patterns discovered. These methods can, however, be used in creating new hypotheses to test against the larger data populations.
Genetic Research
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop a conventional algorithm for effectively performing the task.
Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a field of study within machine learning, and focuses on exploratory data analysis through unsupervised learning. In its application across business problems, machine learning is also referred to as predictive analytics.
Machine learning tasks are classified into several broad categories. In supervised learning, the algorithm builds a mathematical model from a set of data that contains both the inputs and the desired outputs. For example, if the task were determining whether an image contained a certain object, the training data for a supervised learning algorithm would include images with and without that object (the input), and each image would have a label (the output) designating whether it contained the object. In special cases, the input may be only partially available, or restricted to special feedback. Semi-supervised learning algorithms develop mathematical models from incomplete training data, where a portion of the sample input doesn't have labels.
Classification algorithms and regression algorithms are types of supervised learning. Classification algorithms are used when the outputs are restricted to a limited set of values. For a classification algorithm that filters emails, the input would be an incoming email, and the output would be the name of the folder in which to file the email. For an algorithm that identifies spam emails, the output would be the prediction of either "spam" or "not spam", represented by the Boolean values true and false. Regression algorithms are named for their continuous outputs, meaning they may have any value within a range. Examples of a continuous value are the temperature, length, or price of an object.
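A toy version of the spam example might look like the sketch below; the four training emails, the bag-of-words features, and the naive Bayes model are all illustrative assumptions rather than a prescribed method.

```python
# A toy supervised classification task: inputs are short email texts,
# labels are "spam"/"not spam". The tiny data set is purely illustrative.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "claim your free money",
          "meeting notes attached", "lunch tomorrow at noon"]
labels = ["spam", "spam", "not spam", "not spam"]   # desired outputs (training labels)

vectorizer = CountVectorizer()
features = vectorizer.fit_transform(emails)          # inputs as word-count vectors
model = MultinomialNB().fit(features, labels)

new_email = ["free prize meeting"]
print(model.predict(vectorizer.transform(new_email))[0])
```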
In unsupervised learning, the algorithm builds a mathematical model from a set of data that contains only inputs and no desired output labels. Unsupervised learning algorithms are used to find structure in the data, like grouping or clustering of data points. Unsupervised learning can discover patterns in the data, and can group the inputs into categories, as in feature learning. Dimensionality reduction is the process of reducing the number of "features", or inputs, in a set of data.
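The following sketch illustrates both ideas on synthetic, unlabeled data: dimensionality reduction followed by clustering. The data, the choice of PCA and k-means, and the parameter values are assumptions made for the example.

```python
# Unsupervised learning on unlabeled inputs: dimensionality reduction with PCA
# followed by clustering of the reduced data.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(5)
# 200 samples with 10 "features"; two hidden groups differ only in their means.
group_a = rng.normal(0.0, 1.0, size=(100, 10))
group_b = rng.normal(3.0, 1.0, size=(100, 10))
data = np.vstack([group_a, group_b])

reduced = PCA(n_components=2).fit_transform(data)     # 10 features reduced to 2
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(reduced)
print("points per discovered cluster:", np.bincount(clusters))
```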
Active learning algorithms access the desired outputs (training labels) for a limited set of inputs based on a budget, and optimize the choice of inputs for which they will acquire training labels. When used interactively, these inputs can be presented to a human user for labeling. Reinforcement learning algorithms are given feedback in the form of positive or negative reinforcement in a dynamic environment, and are used in autonomous vehicles or in learning to play a game against a human opponent.
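A minimal tabular Q-learning sketch gives the flavor of reinforcement learning from positive reinforcement; the toy corridor environment and all hyperparameters are illustrative assumptions, far simpler than what autonomous vehicles or game-playing systems use.

```python
# Tabular Q-learning on a small corridor: the agent is rewarded only for
# reaching the rightmost cell, and learns a policy from that reinforcement.
import random

n_states, actions = 5, [-1, +1]          # move left or right
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

for _ in range(500):                     # training episodes
    state = 0
    while state != n_states - 1:
        if random.random() < epsilon:    # occasionally explore
            a = random.choice(actions)
        else:                            # otherwise exploit current estimates
            a = max(actions, key=lambda act: q[(state, act)])
        nxt = min(max(state + a, 0), n_states - 1)
        reward = 1.0 if nxt == n_states - 1 else 0.0      # positive reinforcement
        best_next = max(q[(nxt, act)] for act in actions)
        q[(state, a)] += alpha * (reward + gamma * best_next - q[(state, a)])
        state = nxt

# Learned policy: expected to move right (+1) from every non-terminal state.
print({s: max(actions, key=lambda act: q[(s, act)]) for s in range(n_states - 1)})
```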
Other specialized algorithms in machine learning include topic modeling, where the computer program is given a set of natural language documents and finds other documents that cover similar topics. Machine learning algorithms can be used to find the unobservable probability density function in density estimation problems. Meta learning algorithms learn their own inductive bias based on previous experience.
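For the density-estimation task, a kernel density estimate is one common approach; the sketch below fits one to simulated samples. The mixture used to generate the data and the evaluation points are illustrative assumptions.

```python
# Density estimation: approximating an unobservable probability density
# function from samples using a kernel density estimate.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(6)
samples = np.concatenate([rng.normal(-2, 0.5, 300), rng.normal(2, 1.0, 700)])

kde = gaussian_kde(samples)
for x in (-2.0, 0.0, 2.0):
    print(f"estimated density at x={x:+.1f}: {kde(x)[0]:.3f}")
```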
In developmental robotics, robot learning algorithms generate their own sequences of learning experiences, also known as a curriculum, to cumulatively acquire new skills through self-guided exploration and social interaction with humans. These robots use guidance mechanisms such as active learning, maturation, motor synergies, and imitation.
In computer science, artificial intelligence (AI), sometimes called machine intelligence, is intelligence demonstrated by machines, in contrast to the natural intelligence displayed by humans. Leading AI textbooks define the field as the study of "intelligent agents": any device that perceives its environment and takes actions that maximize its chance of successfully achieving its goals. Colloquially, the term "artificial intelligence" is often used to describe machines (or computers) that mimic "cognitive" functions that humans associate with the human mind, such as "learning" and "problem solving".
As machines become increasingly capable, tasks considered to require "intelligence" are often removed from the definition of AI, a phenomenon known as the AI effect. A quip in Tesler's Theorem says "AI is whatever hasn't been done yet." For instance, optical character recognition is frequently excluded from things considered to be AI, having become a routine technology. Modern machine capabilities generally classified as AI include successfully understanding human speech, competing at the highest level in strategic game systems (such as chess and Go), autonomously operating cars, intelligent routing in content delivery networks, and military simulations.
Artificial intelligence was founded as an academic discipline in 1955, and in the years since has experienced several waves of optimism, followed by disappointment and the loss of funding (known as an "AI winter"), followed by new approaches, success and renewed funding. For most of its history, AI research has been divided into subfields that often fail to communicate with each other. These sub-fields are based on technical considerations, such as particular goals (e.g. "robotics" or "machine learning"), the use of particular tools ("logic" or artificial neural networks), or deep philosophical differences. Subfields have also been based on social factors (particular institutions or the work of particular researchers).
The traditional problems (or goals) of AI research include reasoning, knowledge representation, planning, learning, natural language processing, perception and the ability to move and manipulate objects. General intelligence is among the field's long-term goals. Approaches include statistical methods, computational intelligence, and traditional symbolic AI. Many tools are used in AI, including versions of search and mathematical optimization, artificial neural networks, and methods based on statistics, probability and economics. The AI field draws upon computer science, information engineering, mathematics, psychology, linguistics, philosophy, and many other fields.
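Of the tools listed, search is perhaps the easiest to show in a few lines. The sketch below runs breadth-first search over a tiny hand-made state graph; the graph and state names are illustrative assumptions, not a standard benchmark.

```python
# One classic AI tool is search: breadth-first search over a small state graph,
# returning a shortest path (by number of steps) from a start state to a goal.
from collections import deque

graph = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["c", "goal"],
    "c": ["goal"],
    "goal": [],
}

def bfs_path(start, goal):
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph[path[-1]]:
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None

print(bfs_path("start", "goal"))   # expected: ['start', 'b', 'goal']
```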
The field was founded on the assumption that human intelligence "can be so precisely described that a machine can be made to simulate it". This raises philosophical arguments about the nature of the mind and the ethics of creating artificial beings endowed with human-like intelligence. These issues have been explored by myth, fiction and philosophy since antiquity. Some people also consider AI to be a danger to humanity if it progresses unabated. Others believe that AI, unlike previous technological revolutions, will create a risk of mass unemployment.
In the twenty-first century, AI techniques have experienced a resurgence following concurrent advances in computer power, large amounts of data, and theoretical understanding; and AI techniques have become an essential part of the technology industry, helping to solve many challenging problems in computer science, software engineering and operations research.
Data technology has been used to manage big data sets, build solutions for data management and integrate data from various sources to discover new business or analytical insights from collected information.
The growing global volume of generated data (forecast to reach 163 zettabytes in 2025) drives spending on technologies that help organizations control their data assets. The big data market is expected to reach $156.72 billion by 2026. Global spending on data, including data technologies, in digital marketing reached $26.0 billion in 2019.
Data technologies include various solutions that are focused on data. These technologies are developed to help manage data generated by humans or by machines, the number of which is projected to reach 200 billion by 2020. Data technologies aim to manage growing data streams, extract valuable insights from data, and integrate the most important data sources for companies and organizations. Key areas for the DataTech sector are:
Data Management Technologies - technologies and platforms for managing growing sets of data, such as data generated by customers (1st-, 2nd- and 3rd-party data). Common platforms for managing data are the Data Management Platform and the Customer Data Platform.
Data Integration - services that match data from two or more sources to get more information about the data already stored. If a company collects user data in a customer-relationship management system, it can enrich it with data from external sources to create a 360-degree customer view (by integrating data, the company can learn, for example, the interests, demographics and intentions of the users in its databases); a minimal sketch of this idea follows the list below.
Data Consulting - services based on analysing customer data and discovering insights from big data sets, using machine learning algorithms to find useful information in chaotic data.
Technologies for the AdTech sector - products and services that support the digital marketing environment, including supply-side platforms (SSPs), demand-side platforms (DSPs) and services used for targeting the right group in online campaigns.
Building a strategic data ecosystem - services that allow an organization to build a data ecosystem by identifying and choosing the right data sources, integrating data, and preparing suitable analytical algorithms to discover new insights about customers.
Internet of Things - products and services that help store and manage data generated by machines.
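The sketch below (referenced in the Data Integration item above) joins a hypothetical CRM table with an external source to enrich customer records; the column names, values, and use of pandas are illustrative assumptions.

```python
# Minimal data-integration sketch: enriching CRM records with an external
# source to build a fuller customer view. All values are made up for illustration.
import pandas as pd

crm = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "email": ["a@example.com", "b@example.com", "c@example.com"],
})
external = pd.DataFrame({
    "customer_id": [1, 2],
    "interests": ["cycling", "cooking"],
    "age_group": ["25-34", "35-44"],
})

# Left join keeps every CRM record and adds external attributes where available.
enriched = crm.merge(external, on="customer_id", how="left")
print(enriched)
```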
"Intelligence Quotient (IQ) and Browser Usage" was a hoax study allegedly released by a Canadian company called AptiQuant Psychometric Consulting Co. on July 26, 2011, that claimed to have correlated the IQs of 100,000 internet users with which web browsers they used. Its claims that users of Microsoft's Internet Explorer had lower IQs than users of other browsers was widely covered in the media, and its revelation as a hoax was widely cited as an example of the weaknesses of the media.The speed with which the story was reported was also alleged by some to be indicative of anti-Microsoft bias.
The hoax was arranged by Tarandeep Gill, a web developer from Vancouver, British Columbia. He claimed it was to raise awareness of the outdated nature of earlier versions of Internet Explorer that still have significant market share.
It came amid a wave of other negative coverage of earlier versions of Internet Explorer.
The report was covered by many news outlets. Initially, the discrepancy was explained by suggesting that advanced computer users with high IQs were savvy enough to choose other browsers.
When the report was first covered by the BBC, some readers were skeptical of its authenticity and quickly noted that the domain for the company had only been set up a month prior and that pictures of the company's staff were taken from the French company Central Test. Central Test began investigating the issue and said it was considering legal action against whoever had used the photos. It was initially suspected that the whole thing was a plot to spread malware; however, the PDF was examined and none was found. As the hoax was uncovered, additional problems with the report were raised: the results were noted as improbable, and the task of collecting data from 100,000 users as very difficult. The address given on the website was looked up using Google Street View and turned out to be just a parking lot.
While the story was being covered, Gill posed to the media as Leonard Howard, the fabricated owner of AptiQuant. He also wrote a blog post on the AptiQuant website claiming that the company was being sued by Internet Explorer users and had been receiving hate mail.
Internet Explorer users acted defensively.
Some news outlets criticized the methodology of the study, although without realizing it was a hoax. It was even described as "Junk science at its worst". Some defended the study; for example, The Register wrote, "The methodology of the study appears sound."
Emotional intelligence (EI), emotional leadership (EL), emotional quotient (EQ) and emotional intelligence quotient (EIQ), is the capability of individuals to recognize their own emotions and those of others, discern between different feelings and label them appropriately, use emotional information to guide thinking and behavior, and manage and/or adjust emotions to adapt to environments or achieve one's goal(s).
Although the term first appeared in the 1964 paper "The Communication of Emotional Meaning" by Joel Robert Davitz, a member of the Department of Psychology at Teachers College, Columbia University, and Michael Beldoch, a clinical professor of psychology in psychiatry, it gained popularity in the 1995 book "Emotional Intelligence", written by author and science journalist Daniel Goleman. Since then, EI, and Goleman's 1995 analysis, have been criticized within the scientific community, despite prolific reports of its usefulness in the popular press.
Empathy is typically associated with EI, because it relates to an individual connecting their personal experiences with those of others. Several models currently exist that aim to measure levels of EI. Goleman's original model may now be considered a mixed model that combines what has since been modeled separately as ability EI and trait EI. Goleman defined EI as the array of skills and characteristics that drive leadership performance. The trait model was developed by Konstantinos V. Petrides in 2001; it "encompasses behavioral dispositions and self perceived abilities and is measured through self report". The ability model, developed by Peter Salovey and John Mayer in 2004, focuses on the individual's ability to process emotional information and use it to navigate the social environment.
Studies have shown that people with high EI have greater mental health, job performance, and leadership skills although no causal relationships have been shown and such findings are likely to be attributable to general intelligence and specific personality traits rather than emotional intelligence as a construct. For example, Goleman indicated that EI accounted for 67% of the abilities deemed necessary for superior performance in leaders, and mattered twice as much as technical expertise or IQ. Other research finds that the effect of EI markers on leadership and managerial performance is non-significant when ability and personality are controlled for, and that general intelligence correlates very closely with leadership. Markers of EI and methods of developing it have become more widely coveted in the past decade by individuals seeking to become more effective leaders. In addition, studies have begun to provide evidence to help characterize the neural mechanisms of emotional intelligence.
Criticisms have centered on whether EI is a real intelligence and whether it has incremental validity over IQ and the Big Five personality traits.
Digital intelligence is the sum of social, emotional, and cognitive abilities that enable individuals to face the challenges and adapt to the demands of life in the digital world. It is an emerging intelligence fostered by human interaction with information technology, and it has been suggested that recognizing this intelligence will expand the scope of teaching and learning in the 21st century and all aspects of one's personal and professional life.
The term is also used in business to refer to information obtained through technologies and used as part of an online marketing strategy, and to intelligence in the context of cyber security, such as that mapped out by the Global Commission on Internet Governance. Digital intelligence as discussed here refers to a new type of intelligence as a human capacity, one that combines knowledge, ways of knowing, and the ability to interact effectively in a cultural or community setting.
Digital Intelligence or Digital Intelligence Quotient (DQ) has been defined by the DQ Institute as "a comprehensive set of technical, cognitive, meta-cognitive, and socio-emotional competencies that are grounded in universal moral values and that enable individuals to face the challenges and harness the opportunities of digital life". DQ does not merely refer to the skills needed to use technology more effectively or to awareness of the potential dangers facing children who are constantly online. According to the DQ Institute, DQ is all-encompassing in that it covers all areas of individuals' digital lives, ranging from their personal and social identities to their use of technology, the practical, operational and technical capabilities critical for daily digital life and careers, and the potential safety and security issues of this digital age.
DQ is important in today's world, where so much is technologically driven; people who do not develop a certain level of digital intelligence risk being excluded from an increasingly digital world. As such, it is said to be essential to develop digital intelligence from an early age. DQ is also viewed as measurable and highly learnable.
Jim Gray envisioned "data-driven science" as a "fourth paradigm" of science, one that uses the computational analysis of large data sets as its primary scientific method, and hoped "to have a world in which all of the science literature is online, all of the science data is online, and they interoperate with each other."
Data science differs from the existing practice of data analysis across disciplines, which focuses only on explaining data sets; data science seeks actionable, consistent patterns that can be used for prediction. This practical engineering goal takes data science beyond traditional analytics: data from disciplines and applied fields that lack solid theories, such as health science and social science, can now be gathered and used to generate powerful predictive models.
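To make that contrast concrete, here is a minimal sketch (a hypothetical example, assuming Python with NumPy and scikit-learn installed, and using synthetic data invented for illustration) in which a model is judged not by how well it explains the data it was fitted to, but by how well it predicts data that was held out from fitting.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Synthetic stand-in for a real data set: one noisy linear relationship.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(0, 2.0, size=200)

# Hold out a portion of the data that the model never sees during fitting.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = LinearRegression().fit(X_train, y_train)

# Explanatory, in-sample fit versus the predictive, out-of-sample check.
print("in-sample R^2: ", r2_score(y_train, model.predict(X_train)))
print("out-of-sample R^2:", r2_score(y_test, model.predict(X_test)))

Only the second number speaks to whether the pattern is actionable on new cases, which is the sense in which predictive modeling goes beyond explaining the data at hand.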
In November 1997, C.F. Jeff Wu gave the inaugural lecture entitled "Statistics = Data Science?" for his appointment to the H. C. Carver Professorship at the University of Michigan.
In this lecture, he characterized statistical work as a trilogy of data collection, data modeling and analysis, and decision making.
In his conclusion, he initiated the modern, non-computer-science usage of the term "data science" and advocated that statistics be renamed data science and statisticians data scientists.
Later, he presented his lecture entitled "Statistics = Data Science?" as the first of his 1998 P.C. Mahalanobis Memorial Lectures.
These lectures honor Prasanta Chandra Mahalanobis, an Indian scientist and statistician and founder of the Indian Statistical Institute.
In 2013, the IEEE Task Force on Data Science and Advanced Analytics was launched. In 2013, the first "European Conference on Data Analysis (ECDA)" was organised in Luxembourg,
establishing the European Association for Data Science (EuADS).
The first international conference, the IEEE International Conference on Data Science and Advanced Analytics, was launched in 2014.
In 2014, General Assembly launched a student-paid bootcamp, and The Data Incubator launched a competitive, free data science fellowship.
In 2014, the American Statistical Association section on Statistical Learning and Data Mining renamed its journal to "Statistical Analysis and Data Mining: The ASA Data Science Journal" and
in 2016 changed its section name to "Statistical Learning and Data Science".
In 2015, the International Journal on Data Science and Analytics was launched by Springer to publish original work on data science and big data analytics. In September 2015, the Gesellschaft für Klassifikation (GfKl) added "Data Science Society" to its name at the third ECDA conference at the University of Essex, Colchester, UK.
"Data science" has recently become a popular term among business executives.
However, many critical academics and journalists see no distinction between data science and statistics, whereas others consider it largely a popular term for "data mining" and "big data".
Writing in Forbes, Gil Press argues that data science is a buzzword without a clear definition and has simply replaced “business analytics” in contexts such as graduate degree programs.
In the question-and-answer section of his keynote address at the Joint Statistical Meetings of the American Statistical Association, noted applied statistician Nate Silver said, “I think data-scientist is a sexed up term for a statistician...
Statistics is a branch of science. Data scientist is slightly redundant in some way and people shouldn’t berate the term statistician.”
Similarly, in the business sector, multiple researchers and analysts state that data scientists alone are far from sufficient to give companies a real competitive advantage, and consider data scientists to be only one of four broader job families companies need in order to leverage big data effectively: data analysts, data scientists, big data developers, and big data engineers.
Resource: https://en.wikipedia.org/wiki/Big_data
Understanding Data Mining and Big Data
In April 2002, the International Council for Science (ICSU) Committee on Data for Science and Technology (CODATA) started the Data Science Journal,
a publication focused on issues such as the description of data systems, their publication on the internet, applications and legal issues.
Shortly thereafter, in January 2003, Columbia University began publishing The Journal of Data Science, which provided a platform for all data workers to present their views and exchange ideas.
The journal was largely devoted to the application of statistical methods and quantitative research.
In 2005, the National Science Board published "Long-Lived Digital Data Collections: Enabling Research and Education in the 21st Century", defining data scientists as "the information and computer scientists, database and software engineers and programmers, disciplinary experts, curators and expert annotators, librarians, archivists, and others, who are crucial to the successful management of a digital data collection" whose primary activity is to "conduct creative inquiry and analysis."
In 2001, William S. Cleveland introduced data science as an independent discipline, extending the field of statistics to incorporate "advances in computing with data" in his article "Data Science: An Action Plan for Expanding the Technical Areas of the Field of Statistics," which was published in the International Statistical Review.
In his report, Cleveland established six technical areas that he believed encompassed the field of data science: multidisciplinary investigations, models and methods for data, computing with data, pedagogy, tool evaluation, and theory.
Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven)
and asserted that "everything about science is changing because of the impact of information technology" and the data deluge.
The term is now often used interchangeably with earlier concepts such as business analytics, business intelligence, predictive modeling, and statistics.
Data science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data.
Data science is the same concept as data mining and big data: "use the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems".
Data science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data.
It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.
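To illustrate that unifying view, here is a minimal, hypothetical sketch (assuming Python with pandas and scikit-learn; the data and column names are invented for the example) in which a classical statistical summary and a machine-learning step are applied side by side to the same small, structured data set.

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Small structured data set (hypothetical measurements).
df = pd.DataFrame({
    "height_cm": [150, 160, 165, 172, 180, 185, 155, 178],
    "weight_kg": [50, 58, 61, 70, 82, 90, 52, 77],
})

# Statistics and data analysis: describe the data and its correlations.
print(df.describe())
print(df.corr())

# Machine learning from the same toolbox: let an algorithm group the rows.
features = StandardScaler().fit_transform(df)
df["cluster"] = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(df)

The particular methods are interchangeable; the point is that descriptive analysis and algorithmic pattern-finding are treated as parts of a single workflow for understanding the phenomenon behind the data.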
In 2012, when Harvard Business Review called the data scientist role "The Sexiest Job of the 21st Century", the term "data science" became a buzzword.
Hans Rosling was featured in a 2011 BBC documentary celebrating statistics, and Nate Silver referred to data science as a sexed-up term for statistics.
In many cases, earlier approaches and solutions are now simply rebranded as "data science" to make them more attractive, which can dilute the term beyond usefulness.
While many university programs now offer a data science degree, there exists no consensus on a definition or suitable curriculum contents.
To its discredit, however, many data-science and big-data projects fail to deliver useful results, often as a result of poor management and utilization of resources.