training common sense

[[training-common-sense-day-2]] (friday)
[[training-common-sense-day-3]] (saturday)
[[training-common-sense-day-4]] (monday)
[[training-common-sense-day-5]] (tuesday)



earlier notes:

the Annotater @ Cqrrelations, Jan. 2015

git & the critical-fork-folder
to clone the Pattern 2.6 critical-fork folder


* shift --> 

* co-intelligence between humans and machines
* there is a lot of un-rawness to this data 

* repeat the model untill the model is true

- example: history of western classical music doesn't involve woman, a comment was: but i looked, there are no 
- example: alternative scientific truth searching methods --> work on (non) common sense valid truths

* [how]? [you]? [write]?


from: the article Toward personality insights from language exploration in social media

door ut te betalen

machine categorisation of positive / negative words , from the article: Personality, gender, and age in the language of social media: The Open-Vocabulary Approach

looking at applications

Facebook messages Gender/Age profiles

Hedonometer -->

going to Knowledge Discovery in Data steps, by looking at Pattern

step 1 --> data collection
step 2 --> data preparation
step 3 --> data mining
step 4 --> interpretation
step 5 --> determine actions

created at CLiPS, a research project at the University of Antwerp

annotating the presentation/

dealing with large amounts of data and triyng making senso of that

big data: the idea of looking at knowledge through categories
but its also raw data, which machine can 'read' through without thinking in relation to these categories (looking for truth)

the crudeness of the presented results: what does that mean? (gender differences in social medias)
what does it mean?

we wanto to look at:
- data collection (thier technological as well as social context)
- data preparation (look at the rough raw data and its next 'use')
- data mining (looking into mathematical models, graphs; the translation from linguistic models to mathematical clusters; looking at different projections, what are the differencies? the 'suitability' of the models in relation to the outcomes)

what we want to do:
- look in actual text mining programs, look and discuss what happens
- look at documentation on the collection/ing of this datas (exuses, apologies about the political/social problematics expressed in the 'truthfullness' of the mathematical results) - revealing subtleties
- the research on the data mining is very close to the text generation project

example:smilies-->not recognized as meaningful
meaningfulness of cleaning data (we know it is-like cleaning toilets- but we don't do it)
use facebook data to train next mining process - 'self-fulfilling prophecy' of common sense
data-mining->"the" step, but actually just a small part of the process 
mathematical model--->truth finding moment-->the machine performance
with enough time, with enough data...

compare first results with standard, most of the time human-produced data
e.g. libraries of positivity/negativity online

The best material model of a cat is another, or preferably the same, cat.
Philosophy of Science  1945. (Wiener)

Project IMB: you are what your write on Twitter:

Suggested references
An article that poses questions and reflections on methodology applied in research. They obtain "valid cientific" results that have no common sense at all. The abstract is quite inspiring

ConceptNet aims to give computers access to common-sense knowledge, the kind of information that ordinary people know but usually leave unstated:
Unfortunately, common sense is not yet within reach for this 4-year old: