Projects

Grit Classification and Analysis from Social Media | (2018-Present)

Grit is a prominent concept in psychology that measures an individual’s passion and perseverance for long-term goals. Grit scores are used extensively in practical psychology, academic research and motivational consulting. The concept stems from the view that zeal and persistence of motive, rather than natural talent, play a key role in determining an individual’s long-term success. But how does one distinguish between gritty and non-gritty individuals? Do they behave differently? In this project, we seek to answer these questions in a social media setting, since people’s disposition and personality are mirrored in their social media posts. We build a new crowd-sourced Twitter corpus that contains the Twitter posts of 503 users and their corresponding Grit Scale scores. We then train a novel hierarchical Bi-LSTM model to predict an individual’s grit level from his/her Twitter posts. Finally, we present two use cases of our model: (i) analyzing how grit varies across students at differently ranked universities, and (ii) generating a grit-distribution map for the United States.

Aggression Detection in Social Media - Microsoft Research Project | (2017)

As interaction over the web has increased, incidents of aggression and related behaviour such as trolling, cyberbullying, flaming and hate speech have increased manifold across the globe. While most of these behaviours, like bullying or hate speech, predate the Internet, the reach and extent of the Internet have given them unprecedented power to affect the lives of billions of people. It is therefore of utmost importance that preventive measures be taken to safeguard people using the web, so that it remains a viable medium of communication and connection. In this work, we discuss the development of an aggression tagset and an annotated corpus of Hindi-English code-mixed data from two of the most popular social media platforms in India, Twitter and Facebook, along with a classification system to detect unratified posts. The corpus is annotated using a hierarchical tagset of 3 top-level tags and 10 level-2 tags. The final dataset contains approximately 18k tweets and 21k Facebook comments and is being released for further research in the field. We have also built a deep-learning prototype that uses word and sub-word LSTMs to automatically distinguish ratified linguistic behaviour (both aggressive and non-aggressive) from unratified (aggressive) behaviour. The prototype was later developed into a Google Chrome plug-in that flags unratified comments.
Publications:
LREC-2018 (yet to appear)

Understanding Psycho-Sociological Vulnerability of ISIS Patronizers in Twitter | (2017)

The Islamic State of Iraq and Syria (ISIS) is a Salafi-jihadist militant group that has made extensive use of online social media platforms to promulgate its ideology and to rally individuals to support the organization. An individual’s psycho-sociological background plays a crucial role in determining his/her vulnerability to being lured into joining the organization and taking part in terrorist activities, since his/her behavior largely depends on the society s/he was brought up in. We have analyzed five sociological aspects (personality, values & ethics, optimism/pessimism, age, and gender) to understand the psycho-sociological vulnerability of individuals on Twitter. Experimental results suggest that psycho-sociological aspects indeed act as a foundation to discover and differentiate between prominent and unobtrusive users on Twitter.
Publications:
ASONAM-2017

Semantic Interpretation of Social Network Communities | (2016-2017)

In network science, a community is considered to be a group of nodes that are densely connected internally and sparsely connected externally. However, the semantic interpretation of a community has hardly been studied. In this project, my team attempts to understand whether individuals in a community share similar personalities, values and ethical backgrounds. Finally, we show that personality and values models can be used as features to discover a more accurate community structure than the one obtained from network information alone.
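As a rough illustration of the structural definition of a community given above, the following minimal Python sketch compares the edge density inside a candidate group with the density of edges leaving it. The function name `community_densities` and the toy graph are hypothetical and are not taken from the project's code; the sketch assumes an undirected graph without self-loops.

```python
def community_densities(edges, community, all_nodes):
    """Return (internal_density, external_density) for a candidate community.

    internal_density: fraction of possible node pairs inside the group
                      that are actually connected.
    external_density: fraction of possible inside-outside pairs
                      that are actually connected.
    """
    community = set(community)
    outside = set(all_nodes) - community
    # Treat edges as unordered pairs and drop duplicates and self-loops.
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}

    internal = sum(1 for e in edge_set if e <= community)
    external = sum(1 for e in edge_set if len(e & community) == 1)

    possible_internal = len(community) * (len(community) - 1) // 2
    possible_external = len(community) * len(outside)

    internal_density = internal / possible_internal if possible_internal else 0.0
    external_density = external / possible_external if possible_external else 0.0
    return internal_density, external_density


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

dense_inside, sparse_outside = community_densities(edges, ["a", "b", "c"], nodes)
print(dense_inside, sparse_outside)
```

A group counts as a good community in this sense when the first value is much larger than the second. Features such as personality or values scores would be additional signals on top of this purely structural measure.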
Publications:
AAAI-2017
CSCW-2017
IEEE Intelligent Systems-2018
Information Systems Frontiers-2017

Predicting the Values and Ethics of Individuals by Analysing Social Media Content | (2016-2017)

To find out how users’ social media behaviour and language are related to their ethical practices, this work investigates applying Schwartz’ psycholinguistic model of societal sentiment to social media text. The analysis is based on corpora collected from user essays as well as social media (Facebook and Twitter). Several experiments were carried out on the corpora to classify the ethical values of users, incorporating Linguistic Inquiry and Word Count (LIWC) analysis, n-grams, topic models, psycholinguistic lexica, speech acts, and non-linguistic information, while applying a range of machine learners (Support Vector Machines, Logistic Regression, and Random Forests) to identify the best linguistic and non-linguistic features for automatic classification of values and ethics.
Publications:
EACL-2017
ICON-2016

Figurative Language Analysis | (2016)

Figurative language uses words or expressions with a meaning that differs from their literal interpretation, expressing an idea in an interesting way by using language that usually describes something else. This makes figurative language processing one of the greatest challenges in computational linguistics. During my summer internship at NTU Singapore, I worked on the analysis of various elements of figurative language. My team also developed an automatic satire detection system and published a research paper on this work at ICDM-Sentire-2016.
Publications:
CSCW-2017
ICDM Sentire-2016

Sentiment Analysis of Code-Mixed Text | (2016)

Sentiment analysis seeks to identify the opinions and viewpoints communicated in a given piece of data, generally in the form of text. In recent years, there have been many attempts to classify texts from various sources based on their polarity. However, a major challenge in analyzing textual data is code-mixing, which is especially prominent in a multilingual country like India with its 22 official languages. For example, many native languages are written in code-mixed form using the English script. In this project, my team attempts to provide a sentiment analysis of Telugu, Tamil and Hindi textual content obtained from various social media sources such as Twitter and Facebook. The model will classify a given text as positive, negative or neutral.

Personality Detection from Social Network Profiles | (2016)

According to statistics, Facebook is now the 2nd most popular website and Twitter the 10th. The meanings of social status and Facebook/Twitter status are probably converging day by day. There could be a perpetual debate on whether our digital representations on Facebook/Twitter can capture much about human social relations, but the increasing popularity of these sites and the data they hold have made it more urgent than ever to develop technology that manages this information intelligently. With that necessity in mind, the goal of my present research is to assess the personality of any user from his/her Facebook/Twitter interactions, using the Big Five model: Openness (O), Conscientiousness (C), Extraversion (E), Agreeableness (A) and Neuroticism (N).

Values/Personality Community World Map | (2016)

To understand how someone’s personality and intrinsic values change with geolocation and city, we intend to perform several experiments, the final outcome of which will be a map representing geo-specific values. To create the world map, we intend to collect data from the 40 most popular cities around the world. We will also collect the network structure of at least 2000 users from each city, determine their values and personality, and check for community variations across the world. This values/ethics map would provide an overview of the values and personalities possessed by people from different regions, along with their community structure.

Web Portal for University Hostel | (2015)

Developed a hostel portal for the University to assist hostel residents, using the web2py framework and an SQLite database. The user interface was developed using JavaScript, HTML and Bootstrap.

Home Automation System with User Face Recognition | (2015)

Devised a Home Automation System that validates the user on the basis of face recognition.

Quad Control Robotic Arm for Assistance of Physically Challenged (Android-Based) | (2015)

Developed a prototype robotic arm for assisting aged and physically challenged people. The arm communicates over Wi-Fi and Bluetooth and supports four control mechanisms: remote, smartphone tilt, voice, and hand gesture recognition. The user interface is provided by a self-developed Android application.