The Watsons

"Meet the Watsons" are a set of four projection mapped sculptures who discuss and deconstruct a twitter account live in front of an audience, looking to make you question what you put online, who can access it and the possible tone of your communications. I wanted to expand upon a subject that I looked at previously with another piece 'You probably live in Horsham'1, specifically exploring the security and availability of our metadata2 and what can be derived from looking at it in detail. Asking in a very humanised manner, do you know what you put online? If so, do you think it should be public?

Accessibility was a key consideration with this piece, as I wanted the Watsons to be accessible to everyone, including the hugely differing demographics of Twitter's users. Therefore, I had to find an outlet for my findings that was all-inclusive. For this I turned to creating physical representations of humans as a way of voicing my own concern: they provide a non-threatening, almost homely reminder to the user, guiding them through each step of what could be gathered by looking at their tweets. This also removes both the reading barrier and the language barrier (as the Watsons can speak other languages) seen in "You probably live in Horsham". In addition, the human face helps users interact more freely with the piece, connecting with the overly cute and perfect personas of the Watsons, something specifically considered in order to create something relatable and kind. I used what I would consider a representation of corporate America to help form the Watsons' personality, choosing thick American accents and a typical white middle-class family as the template, something I find to be very typical of the capitalist sector. This cute image, however, is juxtaposed by the thick, dark, harrowing shadows cast behind the piece, which represent the true, much more sinister side behind the 'cute' exterior.

The Watsons comprise six parts: a set of four sculptures, a projection keystone application, a language analysis system, a text-to-speech system, a scripting engine and a front-end system allowing the user to interact with the piece. All were carefully created to make the interaction between the user and the piece seamless and incredibly simple, something essential in a gallery situation. However, it was key that the simplicity should never compromise the detail of the information. Therefore, early on I decided not to use text analysis APIs such as the IBM Watson platform, which I have used before, and instead created a completely bespoke system designed specifically to analyse tweets in the given context. This was a long, laborious process, but it yielded successful results using a mix of my own technology, a few research papers from institutions including Stanford, and a few PHP wrappers by Ian Barber, a Developer Advocate for Google+.


Above: the process of creating the Watsons. Left: inner structure; middle: rough shape; right: finished sculpture.


Sculptures

For the sculptures I chose porcelain, as this much finer clay has surprisingly strong properties and leaves a bright, flawless white matt finish once fired, making the material perfect for projection as it disperses the bright light evenly. All four statues were created in ~12 hours non-stop, as leaving gaps between the creation of the pieces could lead to inconsistent drying. This could be disastrous: differing levels of moisture in the clay could result in the pieces exploding once fired in the kiln. Thankfully this approach worked, as the pieces came out of the kiln without any issues whatsoever.

The level of detail in the sculptures varies according to the amount of animation projected onto them. For example, the boy seen above has both his eyes and mouth projected simultaneously alongside a full face texture. As this could alter the face shape quite significantly, his eyes and mouth have been smoothed, allowing the projection to dictate the shadows and highlights of his face and giving me more freedom in the look of the projection. The baby, however, has minimal animation, so both its eyes and mouth are fully formed.

I had three main inspirations when forming the look of the Watsons. One was Julian Opie's large portrait sculptures8, which have an amazing smoothed aesthetic, rather suited to projection in fact. Another was probably one of my earliest memories of projection mapping, which unusually was displayed during a Lady Gaga concert (2011): the projection-mapped face dubbed "Mother G.O.A.T" was a fantastic example of how projection mapping, when done correctly, can be incredibly realistic, and was likely the precursor to my idea for the Watsons. It also helped me form the faces of the Watsons so the projection mapping looked as natural as possible. I believe "Mother G.O.A.T" was created by Nick Knight; however, the exact details of its creator cannot be found.


Technology

As the technology behind the Watsons is quite substantial, I will explain it in the order in which it is executed when a user interacts with the piece.

1. Front end system
To initiate the interaction with the Watsons, the user visits the site joe.ac on their phone, which redirects them to a portal where their name, Twitter username and pronoun can be entered. When submitted, their phone becomes a ticket, sending their username for analysis and placing them in a queue. Once analysed (after ~20 seconds) and at the front of the queue, the Watsons begin talking to them, with their phone acting as a live transcript showing exactly what the Watsons are saying. This system is programmed in JavaScript, using AJAX to communicate with the database and PHP to handle the caching of the script. The following happens during the analysis stage.
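The ticket-and-queue behaviour described above can be sketched as follows. The real system is JavaScript/AJAX/PHP backed by a database; this is a minimal Python sketch with illustrative names only (`TicketQueue`, `submit`, `next_ready` are not from the actual codebase):

```python
from collections import deque

class TicketQueue:
    """Illustrative sketch of the front-end ticket queue."""

    def __init__(self):
        self.queue = deque()     # usernames in arrival order
        self.analysed = set()    # usernames whose analysis has finished

    def submit(self, username):
        # A submitted phone becomes a ticket: queue the username for analysis.
        self.queue.append(username)
        return len(self.queue)   # the user's position in the queue

    def mark_analysed(self, username):
        # Called once the analysis engine has finished (~20 seconds later).
        self.analysed.add(username)

    def next_ready(self):
        # The Watsons only accept the front of the queue once its analysis is done.
        if self.queue and self.queue[0] in self.analysed:
            return self.queue.popleft()
        return None
```

The key design point is that queue position and analysis state are independent: a user at the front still waits until their analysis completes.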

2. The analysis engine
When the user engages the front-end system, the analysis engine follows a simple but processing-heavy structure (note: it typically takes around one second for a tweet to be analysed in its entirety). This process is programmed and executed entirely in PHP:

  1. Firstly, take the username of the Twitter account (input by the viewer and fed to the process by the front-end system) and fetch the 10 most recent tweets using a PHP wrapper of Twitter's Fabric API3.
  2. These tweets are analysed for embedded geotags or specific place names. This data is extracted, and the frequency of locations is then studied to guess where the user likely lives.
  3. After this, the tweets are fed through a process that looks at usernames mentioned in the tweets, the user's description and the tweet content to try to find a link to a university or school. It looks for specific keywords and attempts to extract the name of the potential university alongside the user's role, e.g. student or lecturer.
  4. Next, the full tweet strings have their sentiment analysed to determine whether they are positive, negative or neutral. This gives a general understanding of how content is viewed by the user and potentially reveals relationships with specific users. The system includes James Hennessey's4 implementation of the naive Bayes algorithm to help detect sentiment. Before the algorithm is run, a list of noise words (e.g. and, a, of, me) is removed from the phrase to reduce the chance of skew.
  5. After the sentiment is analysed, keywords are extracted from the tweets by removing stop words (e.g. afterwards, or, and) compiled from a list by DarrenN on GitHub5. Once these are removed, we are left with the most important keywords within the tweets, allowing us to combine them with the sentiment determined previously to understand whether the user likes something or someone. For example, if the user hates 'Southern Rail' and 'Strike', we have a good idea of what the user means within that tweet.
  6. Now that we have the keywords and sentiment of the tweet, we parse the entire unchanged string through a POS (part-of-speech) tagger. This labels words by their grammatical categories, e.g. noun or verb, allowing us to determine very accurately which words are the most relevant in the tweet. These are matched against the keywords gathered in stage 5 and, alongside the sentiment, sent to the scripting engine.
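The stop-word and sentiment stages above can be sketched as follows. The real engine is PHP and uses James Hennessey's naive Bayes implementation; in this Python sketch a toy word-list scorer stands in for the classifier, and the word lists are illustrative only:

```python
# Illustrative word lists, not the actual DarrenN stop-word list
# or the trained naive Bayes model used by the piece.
STOP_WORDS = {"i", "a", "an", "and", "of", "me", "or", "the", "afterwards", "is", "it"}
POSITIVE = {"love", "great", "happy"}
NEGATIVE = {"hate", "strike", "awful", "delayed"}

def analyse(tweet):
    # Strip punctuation and lowercase, then remove noise/stop words (stages 4-5).
    words = [w.strip(".,!?'\"").lower() for w in tweet.split()]
    keywords = [w for w in words if w and w not in STOP_WORDS]
    # Toy sentiment score: +1 per positive word, -1 per negative word.
    score = sum((w in POSITIVE) - (w in NEGATIVE) for w in keywords)
    sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"keywords": keywords, "sentiment": sentiment}
```

Pairing the surviving keywords with the overall sentiment is what lets the script say something specific, e.g. that the user dislikes 'Southern Rail'.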

3. The scripting engine
After a full analysis of the user's profile has been undertaken, the gathered data is sent to the scripting engine. The script is generated dynamically but follows a basic structure where applicable. This system is also programmed in PHP:

  1. The characters greet the audience, explaining who they are and what they do; this greeting is selected from ~455 different combinations.
  2. The Watsons begin to talk about where the user lives, pulling live geo-relevant data about the area around that location to provide contextual relevance (e.g. 'oh, he lives in Horsham, we often walk the dog in Horsham Park'). This is an attempt to make the user feel less at ease, hopefully making them listen more intently. The location data is gathered during the analysis stage, and the live contextual data is kindly provided by GeckoLandmarks, who gave me an incredibly generous free licence.
  3. They then explore the links found to any universities.
  4. They then move on to the specific tweets analysed, looking for negatively or extremely positively skewed responses.
  5. After completing this, they thank the user for taking part before saying goodbye.

The conversation's duration varies depending on the quality of the content; however, it typically lasts around one and a half minutes. The compiled script is then saved temporarily in JSON format, and the mapping program is notified that the user can be accepted once they reach the front of the queue.

{"success":1, "script":[{"character":"Sarah","text": "Hey+Joe+it%27s+great+to+see+you", "eyes":"l", "plain_text": "Hey Joe it's great to see you", "pause_count":0}]}
Above: an example of the JSON formatted script.
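A quick Python sketch of how such a script line can be consumed: the `text` field is URL-encoded for transport, while `plain_text` holds the readable transcript, so decoding one should reproduce the other.

```python
import json
from urllib.parse import unquote_plus

# The example script line from above.
raw = ('{"success":1, "script":[{"character":"Sarah",'
       '"text": "Hey+Joe+it%27s+great+to+see+you", "eyes":"l",'
       '"plain_text": "Hey Joe it\'s great to see you", "pause_count":0}]}')

script = json.loads(raw)
line = script["script"][0]
# The URL-encoded "text" field decodes back to the human-readable transcript
# shown on the user's phone.
decoded = unquote_plus(line["text"])
```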

4. The Watsons projection application



Mapping system:
The program, which is written entirely in Processing and Java, has a very specific mapping system based on the now-defunct "Keystone" framework by David Bouchard (2013)9. I took this framework, altered it to best suit the animations and multiple projections within my application, and used it to map the faces created in Photoshop onto the sculptures. The textures were created by hand in Photoshop using photos of the finished porcelain sculptures. I photographed all four at the same height and angle, then used Photoshop, along with photos of friends and family and royalty-free images of celebrities, to compile a face that perfectly fit the features of each piece. This image was then broken down into a face, eyes, teeth and mouth, which could be layered upon one another and animated to make them move.
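The idea behind keystone mapping is that the four corners of a flat texture are dragged by hand onto the physical surface, and every texture coordinate is then warped between those corners. The real framework performs a true perspective warp in Processing/Java; the Python sketch below is a simplified stand-in that uses bilinear interpolation between the dragged corners, with illustrative names throughout:

```python
def corner_pin(u, v, corners):
    """Map a texture coordinate (u, v), each in [0, 1], onto a quadrilateral.

    `corners` lists the dragged corner positions in the order
    top-left, top-right, bottom-right, bottom-left. This is a bilinear
    approximation of the perspective warp a keystone framework performs.
    """
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = corners
    # Interpolate along the top and bottom edges, then between them.
    top = (x0 + (x1 - x0) * u, y0 + (y1 - y0) * u)
    bottom = (x3 + (x2 - x3) * u, y3 + (y2 - y3) * u)
    return (top[0] + (bottom[0] - top[0]) * v,
            top[1] + (bottom[1] - top[1]) * v)
```

Dragging any one corner during calibration reshapes the whole mapping, which is what lets a flat face texture sit convincingly on a curved porcelain surface.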

After the application has loaded and had its textures mapped to the sculptures, the program enters a searching mode where it looks for completed scripts and begins reading them. Once a user has been found, the program tells the user's phone the session has begun, and the script is synchronised between the two programs. The program then sends each line's encoded string to the speech synthesis server held at joemcalister.com.

Hey+Joe+it%27s+great+to+see+you
Above: an example of a URL-encoded string.

The text-to-speech service is provided by Amazon's IVONA service6 and is streamed from their data centre in Ireland to my server in Germany. Chunked encoding7 is used during this process to reduce the latency between the audio being generated and played back to milliseconds, allowing the speech to flow smoothly. However, this means no duration is logged in the HTTP response, so it is not simple to determine when speech has ended. To solve this I created a class in Processing that uses Minim and an FFT to monitor the amplitude of the incoming streamed audio, looking for drops to 0.0 and logging them against the pause count found within the JSON-encoded script. This allows the program to accurately determine when the speech has ended and the next line of the script should be read out.
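The end-of-speech heuristic can be sketched like so. The real implementation is a Processing class monitoring Minim's audio buffers live; this Python sketch (with an illustrative amplitude list, threshold and function name) shows only the counting logic: each run of near-zero amplitude counts as one silence, and speech is considered finished once the number of silences exceeds the script line's `pause_count`.

```python
def speech_finished(amplitudes, pause_count, threshold=0.001):
    """Sketch of end-of-speech detection for a chunked audio stream.

    `amplitudes` is a sequence of per-buffer amplitude readings. Because
    chunked transfer gives no total duration up front, we count distinct
    drops to (near) zero; the line is done once we pass the expected
    number of mid-sentence pauses.
    """
    silences = 0
    in_silence = False
    for a in amplitudes:
        if a <= threshold:
            if not in_silence:      # count each silent run once
                silences += 1
                in_silence = True
        else:
            in_silence = False
    return silences > pause_count
```

Counting runs rather than individual zero samples is what stops a single long pause from being mistaken for several.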

The application reads out the entire script using this process. Once at the end, a termination signal is sent to the user's phone, displaying a goodbye message, and the search process restarts, looking for the next user.


Final words

I'm incredibly happy with the response the Watsons received during the Symbiosis exhibition: nearly 300 people interacted with the piece over the two days, and many more photos and selfies with the Watsons were taken. In particular, I was pleased with the shock at the accuracy and regularity of the information quoted by the Watsons, with many since remarking that they will make their Twitter accounts private from now on. Despite the huge, overwhelming crowd that surrounded the Watsons, the mapping application never faltered, failing interaction-wise only three times: once because a Twitter rate limit was exceeded (after 30 requests in less than 15 minutes!), once when Eduroam disconnected, and once when IGOR (the departmental server on which the Watsons were hosted) had a small timeout issue. None of these were due to my software, showing that the fail-safes built into the program worked perfectly. This was a big achievement for me, as I had to largely predict the demand the Watsons would see, and my estimates were greatly under the final figure.

Therefore, after talking to many people as they experienced the piece, I believe my core message of realising the intimacy of what we put online was heard, and more importantly heard in a pleasant and understanding way, with fewer shock tactics used and instead careful, considerate questions being posed.


Exhibitions and awards

The Watsons have been displayed at the Symbiosis Exhibition (Goldsmiths, University of London) on 28th April 2016 and at the Computing degree show 'Generation' (Goldsmiths, University of London) on 2nd June 2016.

They also won the 'Best Creative' award at the Computing Degree show 2016.

"It was clear that this project had a solid conceptual centre but also a really robust delivery method made up of layers of different technologies that remarkably coalesced into a single engaging experience for the audience." - Unthinkable Digital's Justin Spooner on The Watsons


Sources

1 You probably live in Horsham: link
2 Metadata: link
3 Twitter's Fabric API: link
4 James Hennessey: link
5 DarrenN stopwords list: link
6 IVONA: link
7 Chunked encoding: link
8 Julian Opie's Sculptures: link
9 David Bouchard's Keystone: link