January 23, 2017
For the past four months, my team and I have been working on our capstone project: a machine learning system that detects intrusion attempts into a network, with a focus on host-based intrusion detection. Significant security breaches in recent years, such as the Yahoo hack, make this project particularly relevant. As part of the project, we are working with a Toronto-based security company that acts as an advisor, and we are building the system using a publicly available dataset. This post focuses on the design of the front-end user interface. Some of the user interviews were conducted together with my team, but all conclusions and design decisions presented here are my own and are made for the purposes of SYDE 542.
The target users are corporations that manage their own networks and have enough data to run such a system. Within those organizations, the primary users are security operations center (SOC) operators.
This post covers the use of user interviews, literature review, and information hierarchy design (via card sorting). All the notes are available at User Interviews and Card Sorting. Here I focus on the results of these design processes and their impact on the proposed designs.
Due to the complex nature of the product being built, the number of users I could access for interviews was fairly limited. I therefore relied heavily on interviews with a colleague experienced in network security (Huzaifa), a colleague with experience at a security startup and as a machine learning specialist (Erik), and our industry contact at Lyrical Security (Chayim). Because the interviewees have diverse backgrounds, each interview focused on slightly different aspects of the product. The interviews with Huzaifa and Chayim focused heavily on the type of data required by network and security specialists, whereas the interview with Erik focused far more on building trust between users and a machine learning system, drawing on his experience building ML systems at 500px. Apart from these interviews, I also drew on my own experience as a Software Infrastructure Engineer at Shopify to identify the features infrastructure engineers would need in order to deploy and maintain such a system.
The user interviews gave me information about several aspects of the project. From the networking perspective, I learned which data is particularly important to an expert user: fields like time, source and destination IP address, and port. Our network expert reacted positively to the idea of visualizing statistics such as the number of attacks and the number of connections, whereas our security expert actually recommended against focusing on data visualization. He has been in the industry for a long time, and in his experience data visualization has not proved useful, especially in existing solutions. He finds it more important to display the data that helps a security operator detect (or confirm) an attack, or find ways to harden the system against such attacks. According to him, we should stick to simple visualizations such as the number of good or bad events and total versus good and bad events. Unfortunately, our network specialist could not name the specific applications he used at work due to NDA concerns.
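To make this concrete, here is a minimal sketch of the kind of event record and simple good/bad summary these interviews point toward. The field names, the "verdict" label, and the use of TypeScript are my own assumptions for illustration, not the schema our capstone system actually uses.

```typescript
// Illustrative only: field names and the "verdict" label are assumptions,
// not the capstone system's actual schema.
interface NetworkEvent {
  timestamp: Date;          // when the connection was observed
  sourceIp: string;         // source IP address
  destinationIp: string;    // destination IP address
  destinationPort: number;  // destination port
  verdict: "good" | "bad";  // the detector's label for the event
}

// Counts backing the simple "total vs. good vs. bad events" view
// our security expert suggested.
function summarizeEvents(events: NetworkEvent[]) {
  const bad = events.filter((e) => e.verdict === "bad").length;
  return { total: events.length, bad, good: events.length - bad };
}
```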
After talking with Erik about his experience as a machine learning specialist at 500px, I got insight into how developers and non-developers perceive machine learning systems. In his experience, most developers know that machine learning is simply math, whereas non-developers tend to treat it as magic. He also noted that exposing a system performance metric to users can be useful for highly specialized systems, because interpreting such a metric requires expertise; it would not work for software used by a large number of non-expert users. In Erik's experience, not telling users that the system relies on machine learning can also be an effective way to improve trust in it.
To better understand the implications of such a system, we did extensive research on the machine learning aspects of the project; for the purposes of this post, I will only cover the literature review done for the UI aspect. A team from the University of Toronto recently published a paper titled "Adapting Level of Detail in User Interfaces for Cyber Security Operations" [link]. The paper describes building a user interface aimed at delivering large amounts of data with minimal added cognitive workload for security operators, and it stresses the importance of adapting the level of detail at which information is displayed based on user actions and motives. In the paper, this is done with machine learning models that automatically adjust the level of detail based on user actions. Instead of relying on machine learning, my proposed UI gives the user the ability to change the level of detail based on their needs. This is done through manual interactions so that the UI behaves consistently and promotes user trust in the system. The decision not to incorporate machine learning into the front end was based on the interview with my machine learning colleague and his personal experience with people's apprehensions about ML systems in a working environment.
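As an illustration of what this manual control might look like, here is a minimal sketch of a user-selected level-of-detail setting that determines which event fields are shown. The level names and field groupings are assumptions made for this post, not the final capstone UI.

```typescript
// A minimal sketch of a manual (user-selected) level-of-detail control.
// Level names and field groupings are illustrative assumptions.
type DetailLevel = "overview" | "standard" | "deep-dive";

const FIELDS_BY_LEVEL: Record<DetailLevel, string[]> = {
  "overview": ["timestamp", "verdict"],
  "standard": ["timestamp", "verdict", "sourceIp", "destinationIp"],
  "deep-dive": ["timestamp", "verdict", "sourceIp", "destinationIp", "destinationPort"],
};

// The user, not a model, picks the level, so the UI behaves consistently
// and predictably for the operator.
function visibleColumns(level: DetailLevel): string[] {
  return FIELDS_BY_LEVEL[level];
}
```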
Card Sorting
The raw data collected for card sorting can be accessed at the following link. Because the system is aimed at expert users and I did not have access to such users, mostly novice users were asked to complete the card sorting activity. The main result I observed was that certain terms my capstone group takes for granted were unclear to testers, which often led to cards being classified in the wrong section. The major takeaway from the activity was to use very explicit language when running it. For example, using the phrase "Deep Dive" as a section caused confusion about its meaning. Similarly, because "IP address" was listed without noting that it was specific to an event, it was classified in the "Overview" section, which is not viable from a system perspective. The rest of the results showed around 70% agreement on average. For future reference, using very explicit language is a must to get better results, and ideally this activity would be performed with an expert user, which was not possible in this case.
March 11, 2017