SkillsCast

Omnia mutantur, and other data mining pitfalls

8th February 2016 in London at CodeNode

There is 1 other SkillsCast available from Omnia Mutantur and Language Data

The use of data drawn from the social web or from large sets of textual resources consistently carries with it the challenge of unpredictability. This talk explores some of the issues with mining and processing natural language information that were encountered during recent research projects on aspects of social activity on the Web. In particular, domain-specificity and social contextualisation make it difficult to apply machine learning systems across domains and over time; accuracy, for example, may vary.

Emma Tonkin

Emma Tonkin is a researcher at King's College London, where she is currently researching the application of time series analysis to evaluate obsolescence in digital preservation. She originally trained as a physicist at the University of Bath, and worked at several internet start-ups, enterprises and research departments in England, France and Germany. She returned to Bath for an MSc in Human-Computer Communication, which led to research roles in web engineering, robotics and computer vision. Subsequent research included applications of machine learning to digital library problems and to the study of the social web. She holds a PhD in Computer Science from the University of Bristol on the topic of context-awareness via social and physical sensors, as well as Open University qualifications in modern languages and classical studies.
