From Raw Text to Machine-Readable and Human-Interpretable Information

Bishan Yang

Everywhere we look on the internet we see text; it is inescapable.  Attempting to read and understand even a small fraction of that text is daunting, so Bishan Yang takes an automated approach using Natural Language Processing.  In the past few years, researchers have had success identifying entities such as people and objects, as well as simple relationships between them.  However, identifying more complex phenomena in text, such as opinions, has not seen the same success.  This is where Bishan Yang focuses her work.  Her method identifies opinions: who holds the opinion, what the opinion is about, and whether the opinion is positive or negative and how strongly.  It demonstrates improvements over all baselines, some of them considerable.  I am interested to see whether techniques like this one can be used to extract public opinions and reactions.  Most of Bishan Yang’s reported accuracies were around fifty percent, but I do not think we are far from making this kind of application a reality.

Applying an Interactive Machine Learning Approach to Statutory Analysis

Jaromir Savelka

Determining whether a statute or provision applies underlies statutory analysis and almost all research on legal issues.  In his talk, Jaromir Savelka presents a method for automating this task using Machine Learning.  More specifically, the method uses an interactive machine learning framework, meaning a model (an SVM classifier) is trained with feedback from a human expert.  Such a model allows them to quickly search for statutes and judge whether or not each applies to a specific case.  One of the main issues they ran into was the cold start problem: since the model relies on feedback, it performs poorly early on, before the system has received much or any feedback.  To address this, they tried re-using previously gained knowledge and achieved promising results.  I was a little confused by this solution, because by re-using knowledge they seem to be building on previous simulations in which the cold start problem was present.  Why would you ever not use knowledge gained before, and isn’t the cold start still present in the first simulation?  However, I have little knowledge of active learning and statutory analysis, and I am interested to learn where my misunderstanding lies, because cold starts are a big issue in many areas of computer science and finding a fix is important.  Jaromir concluded by saying that their method can provide reasonable suggestions (AUC of 0.8), that re-use of knowledge can eliminate the cold start problem, and that accuracy improves with the addition of more documents.  In future work they hope to implement more active learning techniques to improve performance (for instance, identifying whether specific documents are good or bad).
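Savelka’s interactive setup, where the system queries and an expert answers, is an instance of pool-based active learning.  The sketch below is my own illustration of that loop, using a trivial one-dimensional nearest-centroid scorer as a stand-in for their SVM classifier; the data, the "oracle" expert, and all names are hypothetical.

```python
import random

# Toy stand-in for the SVM classifier in the talk: a nearest-centroid
# scorer over a 1-D "relevance" feature. All names here are illustrative.
def train(labeled):
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    return (sum(pos) / len(pos), sum(neg) / len(neg))

def predict(model, x):
    pos_c, neg_c = model
    return 1 if abs(x - pos_c) < abs(x - neg_c) else 0

def uncertainty(model, x):
    pos_c, neg_c = model
    # Least confident = closest to the decision boundary (the midpoint).
    return -abs(x - (pos_c + neg_c) / 2)

random.seed(0)
# Unlabeled pool of provisions; the hidden rule is "applies if x > 0.5".
pool = [random.random() for _ in range(200)]
oracle = lambda x: 1 if x > 0.5 else 0           # the human expert

labeled = [(0.9, 1), (0.1, 0)]                   # seed labels soften the cold start
for _ in range(20):                               # interactive feedback rounds
    model = train(labeled)
    x = max(pool, key=lambda x: uncertainty(model, x))
    pool.remove(x)
    labeled.append((x, oracle(x)))               # expert labels the query

model = train(labeled)
accuracy = sum(predict(model, x) == oracle(x) for x in pool) / len(pool)
```

The "seed labels" line is where the cold start shows up: with no (or few) labeled examples, the earliest queries are close to random, which is exactly what the re-use of prior knowledge aims to fix.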

Causal Learning in Signaling Networks

Presented by Karen Sachs, Ph.D

In Causal Discovery, to make causal predictions from data alone we must have enough data to establish relationships such as conditional independence between variables.  Karen Sachs proposes using single cells as data points: the measurements from each cell act as a row in a large matrix, providing enough data for effective causal discovery.  She finds that this method reproduces cell signaling diagrams from textbooks with remarkable accuracy, though the results are not perfect, and she also discusses flaws in her approach.  One important issue, which I was not aware of beforehand, involves how the data is measured.  When measuring cell-based data, we cannot “take a movie of it,” as Karen Sachs puts it.  The best we can do is take a snapshot of the amounts of each protein, which is problematic because we must destroy the cell to do so, meaning we cannot take a snapshot of the same cell twice.  Ideally we would like to watch the levels of each protein change in a cell over time, but with only snapshots this is difficult.  We know we have time-sensitive data, but we have no idea what its ordering should be, because it all came from different cells at different phases of signaling.  Even with a perfect Causal Discovery algorithm there will be errors because of this.  I believe that using single cells as data is a phenomenal idea, but the nature of the data makes discovering true causality difficult.  I am very interested to see what countermeasures are taken next to circumvent this issue.
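Sachs’ premise, that thousands of single-cell snapshots supply enough rows to test conditional independence, can be illustrated with a toy signaling chain X → Y → Z: X and Z are correlated, but become (nearly) independent once Y is conditioned on.  The sketch below uses partial correlation as a Gaussian proxy for a conditional-independence test; the simulated "cells" and all parameters are invented for the demo.

```python
import math
import random

random.seed(1)

# Simulate a signaling chain X -> Y -> Z: each "row" is one cell snapshot.
n = 2000
X = [random.gauss(0, 1) for _ in range(n)]
Y = [x + random.gauss(0, 1) for x in X]
Z = [y + random.gauss(0, 1) for y in Y]

def corr(a, b):
    # Plain Pearson correlation coefficient.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = math.sqrt(sum((x - ma) ** 2 for x in a))
    vb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (va * vb)

r_xz, r_xy, r_yz = corr(X, Z), corr(X, Y), corr(Y, Z)
# Partial correlation of X and Z given Y: near zero when X and Z are
# conditionally independent given Y (the relationships causal discovery
# algorithms look for).
partial = (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy**2) * (1 - r_yz**2))
```

With enough cells, the marginal correlation r_xz stays clearly nonzero while the partial correlation collapses toward zero, which is the statistical footprint that lets an algorithm orient the chain.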

Data-Driven Healthcare: Visual Analytics for Exploration and Prediction of Clinical Data

Presented by Adam Perer

Hospitals have a wealth of patient data in the form of Electronic Medical Records (EMRs).  In his talk, Adam Perer describes several tools he and his team have been working on to use and understand that data in order to better diagnose and treat patients.  CarePathFlow, the first tool presented, treats each patient as a series of clinical events in order to look at the flow of treatments and outcomes and hopefully make sense of them.  When using the tool, the most relevant (similar) patients are used to create pathways between outcomes, with treatments as the edges between them.  Thicker lines represent more common pathways, and the color of a line represents the average outcome of patients on that path.  The next tool Perer presents is a query refinement visualization tool known as Coquito.  The idea here is that when data scientists want to train a classifier, they must first choose what data to use, which for EMRs is normally a time-consuming process.  Coquito lets scientists quickly query and visualize what data exists in the EMR, making the selection of data much quicker and easier.  Next Perer showed a tool called Infuse, designed to visualize feature selection and classification.  It scores features using four different feature selection algorithms with tenfold cross-validation and then, for spatial reasons, wraps the results into a circle glyph (the more filled in the circle, the more relevant the feature).  Infuse also uses four different classification algorithms to evaluate these choices of features, letting a scientist visualize many combinations of features and classifiers at once and pick whichever seems best.  Lastly, Perer presents Prospector, a tool that helps scientists understand how features interact under the hood: it searches for partial dependencies by tweaking feature values and visually showing how the model reacts.
These tools are important steps toward helping scientists understand data (specifically EMRs) more quickly and completely, and will hopefully result in improved treatment of patients.
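To make the Infuse idea concrete, here is a minimal sketch of scoring features across cross-validation folds.  The score (absolute difference in class means) and the toy patient data are my own illustration; the tool itself uses four feature selection algorithms, which are not reproduced here.

```python
import random

random.seed(2)

# Toy patient records: 3 features, only feature 0 carries signal.
def make_patient(label):
    return ([label + random.gauss(0, 0.5),   # informative feature
             random.gauss(0, 1),             # noise
             random.gauss(0, 1)], label)     # noise

data = [make_patient(random.randint(0, 1)) for _ in range(200)]

def score_feature(rows, j):
    # Crude relevance filter: absolute difference in class means.
    pos = [x[j] for x, y in rows if y == 1]
    neg = [x[j] for x, y in rows if y == 0]
    if not pos or not neg:
        return 0.0
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

k = 10  # tenfold cross-validation, as in the talk
folds = [data[i::k] for i in range(k)]
# Average each feature's score over the folds, as Infuse aggregates
# per-fold rankings into its circle glyphs.
avg = [sum(score_feature(f, j) for f in folds) / k for j in range(3)]
best = max(range(3), key=lambda j: avg[j])
```

A feature that scores well in every fold (like feature 0 here) would show up as a nearly full circle; one that only looks good in a lucky fold would not.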

DESIGNING ADAPTIVE USER TECHNOLOGIES THROUGH AUTOMATING COLLABORATIVE EXPERIMENTATION: COORDINATING PRACTITIONERS AND RESEARCHERS IN HCI, PSYCHOLOGY, EDUCATION, AND STATISTICAL MACHINE LEARNING

Presented by Joseph Jay Williams

In human-to-human interaction, no two explanations are alike.  People constantly modify what they say, experimenting with different wordings and analogies and optimizing their approach for maximal understanding in the receiving party.  However, something as simple as a personalized explanation, which we produce subconsciously, proves not so trivial for computers.  Websites are for the most part one-size-fits-all, displaying the same information for everyone.  But what if they didn’t?  Joseph Jay Williams presents his work on intelligent adaptive agents that use machine learning algorithms to dynamically discover how to optimize and personalize user-requested content.  In his case, he is working on creating personalized explanations for math problems that optimize the user’s learning and understanding.  This concept fascinates me, and I really think it will turn into something big.  Many studies have shown that people respond better when they feel an explanation or message has been created just for them.  Joseph Jay Williams is a long way from creating something as revolutionary as a fully personalized browsing experience, but I believe he and his team are taking the right first steps.
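One standard way to frame this kind of adaptive experimentation is as a multi-armed bandit, where each candidate explanation is an arm and student feedback is the reward.  The sketch below uses Thompson sampling; the framing and the "helpfulness" rates are my own illustration, not necessarily the algorithm Williams uses.

```python
import random

random.seed(3)

# Three candidate explanations for a math problem, with hidden
# probabilities that a student rates them helpful (invented for the demo).
true_rates = [0.3, 0.5, 0.7]
wins = [1, 1, 1]     # Beta(1, 1) prior for each explanation
losses = [1, 1, 1]

picks = [0, 0, 0]
for _ in range(3000):
    # Thompson sampling: draw a plausible rate per arm, show the best draw.
    samples = [random.betavariate(wins[i], losses[i]) for i in range(3)]
    arm = max(range(3), key=lambda i: samples[i])
    picks[arm] += 1
    if random.random() < true_rates[arm]:   # student found it helpful
        wins[arm] += 1
    else:
        losses[arm] += 1
```

The system keeps experimenting early on, then converges to showing the most helpful explanation almost all the time, which is the automated version of a teacher refining their wording.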

PARSE TREE FRAGMENTATION OF UNGRAMMATICAL SENTENCES

Presented by Huma Hashemi

Sentences are not always grammatically correct; in fact, I would wager that most sentences on the internet have some form of grammatical error.  Huma Hashemi presents work on training a parser on grammatically flawed sentences in order to uncover their hidden structure.  This allows machines to gain an understanding of problematic sentences, which is very important in tasks such as machine translation.  For data, she uses sentences from English as a Second Language students that have been corrected by a teacher, as well as machine translation output that includes human edits.  The proposed method involves generating a parse tree and breaking it into fragments of reasonably isolated parts.  In fluency judgment, her method held its own against other leading methods.  I believe this is an interesting approach to an important problem, and I am excited to see where future work takes her.

Unsupervised Deep Learning Reveals Prognostically Relevant Subtypes of Glioblastoma

Defended by Jon Young

Deep Learning has had much success of late across the field of Artificial Intelligence due to its ability to represent complex relationships and to learn features on its own.  Past work has shown that Artificial Neural Networks can take unannotated high-dimensional input and learn relevant lower-dimensional features.  Jon took such an approach in cancer biomedical informatics, hoping that, given high-dimensional cancer data as input, a lower-dimensional representation could be discovered that would facilitate understanding of the cellular signaling pathways that drive the disease.  Unfortunately, with such high-dimensional input, computation can take months.  Jon reduced the input by removing information relevant to cell type (cell type features would yield uninformative results) and by an unsupervised method known as filtering by variance.  The resulting low-dimensional representation performed worse than Jon had expected; however, understanding cancer is a notoriously difficult problem (or else cancer would be well understood and solved), so such results were not unexpected.  More importantly, the results were statistically better than the baseline, which shows that understanding can be derived from such methods.  Through clustering, Jon discovered the results were largely explained by cell type, which eclipsed the signal he was interested in; in future work he hopes to remove this confounding variable.  I am curious about the performance of this method when the input hasn’t been filtered by variance.  I understand that, in theory, features that don’t vary much are relatively uninformative.  However, isn’t it possible that some features removed via this filtering are in fact informative and simply sensitive (i.e., they don’t need to vary much to have an effect)?
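Filtering by variance is simple to state: compute each feature’s variance across samples and drop those below a threshold.  A minimal sketch, with an invented toy expression matrix and an arbitrary threshold:

```python
import random
import statistics

random.seed(4)

# Toy expression matrix: 100 samples x 4 genes. Genes 0 and 2 vary a
# lot across samples; genes 1 and 3 are nearly constant.
def sample_row():
    return [random.gauss(0, 3), random.gauss(5, 0.1),
            random.gauss(0, 2), random.gauss(1, 0.05)]

rows = [sample_row() for _ in range(100)]
columns = list(zip(*rows))

# Unsupervised filter: keep features whose variance clears a threshold.
threshold = 1.0
kept = [j for j, col in enumerate(columns)
        if statistics.variance(col) > threshold]
```

This also makes the closing question concrete: gene 3 here would be discarded regardless of whether its tiny fluctuations happen to be biologically meaningful.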

Unifying Logic and Statistical AI

Presented by Pedro Domingos

Computer scientists have applied both logical and statistical methods to problems across every area of Artificial Intelligence.  The two approaches have their strengths and weaknesses and are inherently different, making them difficult to unify.  Pedro Domingos promotes an Artificial Intelligence method with this goal in mind: Markov Logic, which combines Markov Networks and First Order Logic.  Each First Order Logic formula is assigned a weight and used as a feature in a Markov Network, so that worlds violating a formula become less probable rather than impossible.  The First Order Logic side accounts for the complexity of the world, while the statistical side handles its uncertainty.  Pedro Domingos claims this method is a candidate for what he calls the interface layer of Artificial Intelligence (an abstraction that would allow simple interaction with the complexities of the field).  A successful implementation of the interface layer would increase the rate of progress for the field as a whole.  While I believe Markov Logic offers a powerful and unique approach, I am hesitant to believe it can act as the interface layer Pedro Domingos claims it will.  Markov Logic has been around for nearly ten years now and doesn’t seem to have the backing that an algorithm poised to revolutionize an entire field of Computer Science should.  In general, I don’t believe we will find a master algorithm that eclipses all other methods.  However, I have little doubt that Markov Logic will play an invaluable role in certain sub-domains of Artificial Intelligence.
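In Markov logic, each formula i carries a weight w_i, and a world x has probability P(x) = exp(sum_i w_i * n_i(x)) / Z, where n_i(x) counts the formula’s true groundings in x and Z normalizes over all possible worlds.  A tiny propositional sketch with two invented weighted formulas over one person:

```python
import itertools
import math

# Two weighted formulas over the ground atoms Smokes(A), Cancer(A):
#   w = 1.5 : Smokes(A) => Cancer(A)
#   w = 0.5 : Smokes(A)
# (Weights are invented for the demo.)
def n_true(world):
    smokes, cancer = world
    implies = (not smokes) or cancer
    return [1 if implies else 0, 1 if smokes else 0]

weights = [1.5, 0.5]

def unnormalized(world):
    # exp of the weighted count of satisfied formulas.
    return math.exp(sum(w * n for w, n in zip(weights, n_true(world))))

worlds = list(itertools.product([False, True], repeat=2))
Z = sum(unnormalized(w) for w in worlds)          # partition function
prob = {w: unnormalized(w) / Z for w in worlds}
```

The key point is visible in the numbers: a world that violates the implication is penalized rather than ruled out, which is how Markov Logic softens hard logical constraints into probabilities.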

I Think Therefore I Am: The Computability of the Mind – A Lecture from Computing: The Human Experience

Presented by Grady Booch

Is the human mind computable?  That depends on what the human mind is and how it works.  Certainly, if the mind has a spiritual component, all the circuits in the world won’t be able to replicate it.  However, if it is purely material, there is no reason why we won’t eventually be able to create an artificial one.  Fear surrounds the idea of an artificial mind, which Arthur C. Clarke’s “2001: A Space Odyssey” illustrates through the artificial intelligence known as HAL.  Grady Booch claims such an agent will be a necessity in fifteen or twenty years, when humanity moves to Mars.  He goes on to state that we should not fear domination by artificial intelligence, and that what truly frightens us about the singularity lies somewhere else entirely.  If machines develop consciousness and intelligence to rival or even surpass our own, what will happen to our humanity?  What will define what it means to be human?  Surely such a development couldn’t happen overnight; there will be a journey.  Grady claims this journey will fundamentally change us, and that by the time the singularity is achieved we will be something more.  Whatever we are, it will be human, and we will have our humanity.

Multimodal Machine Learning: Modeling Human Communication Dynamics

Presented by Louis-Philippe (LP) Morency

Between people, effective communication is essential to any collaborative effort.  Why, then, do we still rely on a keyboard and mouse to interact with computers?  Louis-Philippe explores solutions to this problem via multimodal machine learning.  In particular, he seeks to unlock the subtleties of human-to-human interaction in the eyes of machines.  Using established techniques from Natural Language Processing in conjunction with those of Computer Vision, he combines verbal, vocal, and visual information and has successfully interpreted human behavior in areas such as healthcare and business.  Doctors around the world perform roughly 40 diagnoses a week on teenagers exhibiting suicidal tendencies, and a course of action for these patients requires a decision after one short meeting; multimodal machine learning can assess a patient and quantify depression to aid doctors in such choices.  In experiments, multimodal machine learning not only successfully identified depression but also found behavioral tendencies of people suffering from it.  In business interviews, it successfully predicted whether or not a job candidate would get an offer and, if so, whether they would accept.  I am interested in seeing how this technology performs in lie detection compared with current methods, as well as whether it can provide more detail than a simple truth classification.  For instance, could multimodal machine learning identify emotions to further explain a truth or lie?
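One common way to combine modalities, offered here as my own illustration rather than Morency’s actual models, is late fusion: each modality produces its own score, and the scores are combined by weights.  All numbers below are hypothetical.

```python
# Hypothetical per-modality scores for one patient interview, each in
# [0, 1], where higher suggests stronger depression indicators.
scores = {"verbal": 0.62,   # e.g. language-use features
          "vocal": 0.71,    # e.g. prosody features
          "visual": 0.55}   # e.g. gaze / facial-expression features

# Late fusion: combine modality scores with (here, hand-picked) weights.
weights = {"verbal": 0.4, "vocal": 0.35, "visual": 0.25}
fused = sum(weights[m] * scores[m] for m in scores)
decision = "flag for clinician review" if fused > 0.5 else "no flag"
```

The appeal of late fusion is that each modality can fail independently (say, poor lighting for the visual channel) without corrupting the others; the weights decide how much each channel is trusted.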
