Byung-Gon Chun

My Recent Research Areas

My recent research focuses on building operating and networked system platforms for mobile systems, Big Data analytics, and cloud computing. I’m interested in conferences such as SOSP, OSDI, NSDI, SOCC, MobiSys, USENIX ATC, SIGCOMM, and USENIX Security. I present an overview of my recent research areas below.

Big Data Systems

Modern clouds run complex applications with varying characteristics. It is important to understand what the workload looks like and how it impacts application performance and operational efficiency. I have been investigating big data system designs driven by the broad and deep analysis of workloads.

Big Data Operating Systems

Current Big Data processing frameworks target specific data processing workloads. I have worked on a project that builds a common “operating system” layer for Big Data processing workloads. The layer supports common abstractions for different kinds of Big Data processing applications such as MapReduce, MPI, ML, and graph processing applications, and provides hooks for intelligently choosing what implementation to use.

[CCC+13] Byung-Gon Chun, Tyson Condie, Carlo Curino, Chris Douglas, Shravan Narayanamurthy, Raghu Ramakrishnan, Sriram Rao, Russell Sears, Markus Weimer. REEF: Retainable Evaluator Execution Framework (Demo Paper). In VLDB, 2013.

Workload-driven Big Data Analytics System Design
I have performed a detailed measurement study of production clusters at Yahoo! running Hadoop, an open-source MapReduce implementation. The study goes beyond prior efforts by presenting a view that is both broad (i.e., considering all jobs in the cluster over an extended period of time) and deep (i.e., examining behaviors at multiple levels: users, jobs, and tasks, down to file system operations and physical resource utilization). We show that these workloads are extremely heterogeneous, yet predictable. While this heterogeneity has been observed in point cases in previous work, we show that it is pervasive, occurring at different granularities and across different metrics (e.g., execution time, storage, and input or output size). We also group jobs by the program that generates them, and show how recurrent and predictable they are. Based on these insights, we re-architect big data systems from the ground up.
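
As a rough illustration of grouping jobs by the program that generates them, the following Python sketch summarizes how often each program recurs and how stable its run time is. The input format, the function name `summarize_by_program`, and the use of the coefficient of variation as a predictability measure are assumptions made for this example; they are not taken from the study itself.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, Iterable, List, Tuple


def summarize_by_program(jobs: Iterable[Tuple[str, float]]) -> Dict[str, Dict[str, float]]:
    """Group (program, run_seconds) records and summarize each group.

    A low coefficient of variation (cv) means the program's run time
    barely changes between runs, i.e. it is easy to predict.
    """
    by_program: Dict[str, List[float]] = defaultdict(list)
    for program, seconds in jobs:
        by_program[program].append(seconds)

    summary: Dict[str, Dict[str, float]] = {}
    for program, durations in by_program.items():
        avg = mean(durations)
        summary[program] = {
            "runs": len(durations),          # how often the program recurs
            "mean_seconds": avg,
            "cv": pstdev(durations) / avg if avg > 0 else 0.0,
        }
    return summary
```

A program that runs many times with a coefficient of variation near zero is both recurrent and predictable in the sense used above; a one-off job with widely varying run times is neither.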

Mobile-Cloud Systems

Mobile applications have become ubiquitous thanks to smartphones and tablets. These applications provide ever richer functionality with cloud services. I have worked on systems that serve data and computation efficiently between mobile devices and the cloud for performance and energy: systems that transparently and intelligently manage data (Mobius), and systems that dynamically adjust application execution between mobile devices and the cloud (e.g., offloading computation from device to cloud) (CloneCloud).

Intelligent Data Serving for Mobile Apps
Mobile application development is challenging for several reasons: intermittent and limited network connectivity, tight power constraints, server-side scalability concerns, and a number of fault tolerance issues. Developers handcraft complex solutions that include client-side caching, conflict resolution, disconnection tolerance, and backend database sharding. To simplify mobile app development, I have developed Mobius, which introduces MUD (Messaging Unified with Data). MUD presents the programming abstraction of a logical table of data that spans devices and clouds. Applications using Mobius can asynchronously read from/write to MUD tables. Applications can also receive notifications when tables change via continuous queries on the tables. The system combines dynamic client-side caching (with intelligent policies chosen on the server-side, based on usage patterns across multiple applications), notification services, flexible query processing, and a scalable and highly available cloud storage system.

[CCS+12] B.-G. Chun, C. Curino, R. Sears, A. Shraer, S. Madden, and R. Ramakrishnan. Mobius: Unified messaging and data serving for mobile apps. In ACM MobiSys, 2012.
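
The MUD abstraction described above (a logical table with asynchronous reads and writes plus change notifications through continuous queries) can be pictured with a small in-memory sketch. This is not the Mobius API: the class `LogicalTable` and its methods `write`, `subscribe`, `read`, and `flush` are hypothetical names chosen for illustration, and caching, conflict resolution, and cloud storage are left out entirely.

```python
from typing import Any, Callable, Dict, List, Tuple

Row = Dict[str, Any]
Predicate = Callable[[Row], bool]
Callback = Callable[[Row], None]


class LogicalTable:
    """Toy stand-in for a MUD-style logical table (hypothetical API)."""

    def __init__(self) -> None:
        self._rows: List[Row] = []
        self._pending: List[Row] = []  # writes queued but not yet applied
        self._subscriptions: List[Tuple[Predicate, Callback]] = []

    def write(self, row: Row) -> None:
        """Queue a write; it becomes visible only after flush()."""
        self._pending.append(dict(row))

    def subscribe(self, predicate: Predicate, callback: Callback) -> None:
        """Register a continuous query: callback fires for each new matching row."""
        self._subscriptions.append((predicate, callback))

    def read(self, predicate: Predicate) -> List[Row]:
        """Return copies of the applied rows that match the predicate."""
        return [dict(r) for r in self._rows if predicate(r)]

    def flush(self) -> int:
        """Apply queued writes in order and notify matching subscribers."""
        applied = 0
        while self._pending:
            row = self._pending.pop(0)
            self._rows.append(row)
            applied += 1
            for predicate, callback in self._subscriptions:
                if predicate(row):
                    callback(dict(row))
        return applied
```

In this sketch a client would call `subscribe` once with a predicate such as "rows addressed to me" and then receive every later matching write as a notification, which is how one table can serve as both a data store and a messaging channel.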

Elastic Execution between Mobile Device and Cloud
CloneCloud addresses smartphones’ hardware limitations by seamlessly offloading part of an application’s execution, via thread migration, from the smartphone to a computational infrastructure hosting a cloud of smartphone clones. Execution between the mobile device and the cloud is elastic, and is optimized to improve performance or energy consumption by combining static analysis, dynamic profiling, and cost optimization. Specifically, CloneCloud improves the performance of applications on resource-starved devices such as smartphones by opportunistically offloading parts of them to available cloud resources in nearby datacenters or cloudlets. The system can support very expensive operations (e.g., image search, virus scanning, and data leak detection) (a) without requiring application designers to explicitly plan for cloning, (b) without draining the smartphone’s battery power, and (c) with significant performance improvement. Our evaluation shows that CloneCloud can adapt application partitioning to different environments, and that it helps some applications achieve as much as a 20x execution speed-up and a 20-fold decrease in energy spent on the mobile device.

[CIM+11] B.-G. Chun, S. Ihm, P. Maniatis, M. Naik, and A. Patti. CloneCloud: Elastic execution between mobile device and cloud. In ACM EuroSys, 2011.
[CM09] B.-G. Chun and P. Maniatis. Augmented smart phone applications through clone cloud execution. In USENIX HotOS, 2009.
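
To make the cost-optimization idea concrete, here is a deliberately simplified Python sketch that decides, method by method, whether running on the clone is cheaper than running on the device once the cost of shipping state is included. It is an illustration under stated assumptions, not CloneCloud's optimizer: it treats each method independently and ignores the program-structure constraints that static analysis would supply. `MethodProfile`, `migration_seconds`, `choose_partition`, and all numbers in the usage note are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class MethodProfile:
    name: str
    device_seconds: float  # profiled run time on the phone
    clone_seconds: float   # profiled run time on the cloud clone
    state_bytes: int       # state shipped to the clone and back


def migration_seconds(state_bytes: int, bandwidth_bps: float, rtt_seconds: float) -> float:
    """Estimated time to move thread state over the network."""
    return rtt_seconds + (state_bytes * 8) / bandwidth_bps


def choose_partition(profiles: List[MethodProfile],
                     bandwidth_bps: float,
                     rtt_seconds: float) -> Dict[str, str]:
    """Place each method where its estimated total time is lower."""
    placement: Dict[str, str] = {}
    for p in profiles:
        remote = p.clone_seconds + migration_seconds(
            p.state_bytes, bandwidth_bps, rtt_seconds)
        placement[p.name] = "clone" if remote < p.device_seconds else "device"
    return placement
```

With a hypothetical 20-second on-device image search that takes 1 second on the clone and ships 1 MB of state, a fast link (8 Mbit/s, 50 ms round trip) favors the clone, while a slow link (80 kbit/s, 500 ms round trip) keeps the method on the device. The same profile therefore yields different partitions in different network environments.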

Predicting Application Performance Using Program Analysis
Predicting how applications will behave for a given input workload is key to helping users and operators better manage those applications. I proposed a new prediction framework that can automatically predict program performance with high accuracy by combining program analysis techniques with machine learning algorithms. The system extracts program execution features (information about program execution runs) through program instrumentation. It uses machine learning techniques to select features relevant to performance and creates prediction models as a function of the selected features. It shows a significant improvement over models that are oblivious to program features.

[KLY+13] Yongin Kwon, Sangmin Lee, Hayoon Yi, Donghyun Kwon, Seungjun Yang, Byung-Gon Chun, Ling Huang, Petros Maniatis, Mayur Naik, Yunheung Paek. Mantis: Automatic Performance Prediction for Smartphone Applications. In USENIX ATC, 2013.
[HJY+10] L. Huang, J. Jia, B. Yu, B.-G. Chun, P. Maniatis, and M. Naik. Predicting execution time of computer programs using sparse polynomial regression. In NIPS, 2010.
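
The pipeline described above (extract execution features, select the ones relevant to performance, fit a model over the selected features) can be sketched in a few lines of Python. The sketch swaps in a much simpler technique than the cited work: it selects the single feature most correlated with run time and fits a straight line by least squares. The feature names and the functions `pearson`, `fit_line`, `train`, and `predict` are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Tuple

Model = Tuple[str, float, float]  # (feature name, slope, intercept)


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def fit_line(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs)
    if vx == 0:
        return 0.0, my  # constant feature: fall back to the mean
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / vx
    return slope, my - slope * mx


def train(runs: List[Dict[str, float]], times: List[float]) -> Model:
    """Select the feature most correlated with run time, then fit a line."""
    best = max(sorted(runs[0]),
               key=lambda f: abs(pearson([r[f] for r in runs], times)))
    slope, intercept = fit_line([r[best] for r in runs], times)
    return best, slope, intercept


def predict(model: Model, run: Dict[str, float]) -> float:
    name, slope, intercept = model
    return slope * run[name] + intercept
```

A model that ignores program features can do no better than predict the mean run time for every input, which is the kind of baseline such feature-based models are compared against.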

Secure Mobile Systems

Mobile devices (such as smartphones, tablets, and smart glasses) and cloud computing together pose interesting security challenges, including information leakage from personal devices and attacks on cloud services. I have been investigating how to secure these systems by tracking information flow efficiently.

Detecting Private Information Leakage from Smartphones

TaintDroid is a smartphone system that gives users adequate control over and visibility into how third-party applications use their private data in real time. It is an efficient, system-wide dynamic taint tracking and analysis system that can simultaneously track multiple sources of sensitive data. TaintDroid leverages Android’s virtualized execution environment: it provides real-time analysis by tracking taint information at the bytecode interpreter level and by interposing on operating system calls (e.g., IPC, networking, and file system I/O). TaintDroid incurs just 14% performance overhead on a CPU-bound micro-benchmark and imposes negligible overhead on interactive third-party applications. Using TaintDroid to monitor the behavior of 30 popular third-party Android applications, we found 68 instances of potential misuse of users’ private information (e.g., location information, device identifiers, and phone numbers) across 20 applications. TaintDroid informs users how their sensitive data is used by third-party applications and provides security service providers with valuable insights. The TaintDroid source code is publicly available, and it has been actively used by the systems and security communities.

[WGC+10] W. Enck, P. Gilbert, B.-G. Chun, L. P. Cox, J. Jung, P. McDaniel, and A. N. Sheth. TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones. In USENIX OSDI, 2010.
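
The core idea of dynamic taint tracking, in which data from sensitive sources carries tags that propagate through computation and are checked at sinks, can be shown with a toy Python sketch. It operates on wrapped strings rather than inside a bytecode interpreter, so it illustrates the concept only and says nothing about how TaintDroid is implemented. The tag constants, the `Tainted` wrapper, the `NetworkSink` class, and the host name are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# One bit per sensitive source, so several sources can be tracked at once.
LOCATION = 1 << 0
DEVICE_ID = 1 << 1
PHONE_NUMBER = 1 << 2
TAG_NAMES = {LOCATION: "location", DEVICE_ID: "device_id",
             PHONE_NUMBER: "phone_number"}


@dataclass(frozen=True)
class Tainted:
    value: str
    tags: int = 0  # 0 means the value carries no sensitive data

    def concat(self, other: "Tainted") -> "Tainted":
        """Propagation rule: the result carries the union of both tag sets."""
        return Tainted(self.value + other.value, self.tags | other.tags)


class NetworkSink:
    """Stand-in for a network send call that is monitored for leaks."""

    def __init__(self) -> None:
        self.alerts: List[str] = []

    def send(self, host: str, payload: Tainted) -> None:
        names = [name for bit, name in TAG_NAMES.items() if payload.tags & bit]
        if names:
            self.alerts.append(f"{host}: {', '.join(names)}")
        # The data is still sent; the sketch only reports the flow.
```

For example, concatenating an untagged query prefix with a value tagged `LOCATION` and another tagged `DEVICE_ID` produces a payload carrying both tags, and sending it through the sink records a single alert naming both sources.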

I am also interested in improving mobile security with offline analysis and online protection approaches. Smartphones and “app” markets are raising concerns about how third-party applications may misuse or improperly handle users’ privacy-sensitive data. Fortunately, unlike in the PC world, we have a unique opportunity to improve the security of mobile applications thanks to the centralized nature of app distribution through popular app markets. Thorough validation of apps, applied as part of the app market admission process, has the potential to significantly enhance mobile device security. I believe that automated offline validation of smartphone apps at the app-market level is a promising approach for drastically improving the security of smartphones. AppInspector, a project that I am working on, is one such system: it analyzes apps and generates reports of potential security and privacy violations.

AppInspector first installs and loads the app on a virtual smartphone. An input generator running on the host PC then injects user interface events and sensor input. The smartphone application runtime is augmented with an execution explorer that aids in traversing possible execution paths of the app. While the app runs, an information-flow and action tracking component monitors privacy-sensitive information flows and generates logs. Finally, AppInspector provides security analysis tools which can be used after execution completes to interpret the logs and generate a report. I plan to continue this project, build the system, and deploy it in app markets or provide it as a third-party service running in cloud computing infrastructure.

[GCC+11] P. Gilbert, B.-G. Chun, L. P. Cox, and J. Jung. Automated Security Validation of Mobile Apps at App Markets. 2nd International Workshop on Mobile Cloud Computing and Services (MCS 2011), June 2011.

Written by bgchun

August 17, 2013 at 1:47 pm
