Here are some project suggestions from us. If you have another idea related to programming language research, especially one targeting our main languages of interest (JavaScript, R, Julia, or Racket), we encourage you to contact us.
Level: PhD
Data analysis is typically performed by composing a series of discrete tools and libraries into a data analysis pipeline. These pipelines are at the core of data-driven science, which has become central to most disciplines and today sees an explosion in the use of computational methods and available data. As the number of tools and the size of the data keep growing, we face problems with the scalability of the pipelines and the trustworthiness of their results.
The goal of this work is to research ways to make data analysis pipelines scalable (accommodate growing data and computational needs) and trustworthy (facilitate auditing of the analysis result). The research will go along two axes. The first will focus on extending the R programming language with transparent horizontal and vertical scaling. The second will study a combination of static and dynamic program analysis techniques to gain insight into the nature and severity of programming errors in the code of data-analysis pipelines, and propose algorithms for their detection and possible automated repair.
with Konrad Siek
Working with runtimes, be it hacking on the runtime internals themselves or on their associated bytecode interpreters, often requires peering into the machine and figuring out what its state is (and why it's going boing at any given time). Usually this is done with a debugger, and developers accumulate useful expressions and functions for inspecting the state of a particular runtime structure. These structures can be convoluted and unintuitive, and they are definitely not designed to be user-readable, so the work is messy and tedious. The idea for this project is to create a framework for visualizing the internals of virtual machines: the heap, the operand and frame stacks, the registers, and the code vector. The framework would show the current state of the runtime as it is being debugged, in a clear and human-readable fashion. The first runtimes we would like to support are the GNU R interpreter and the Ř bytecode developed in our lab, but the goal is to implement the framework robustly and to allow extensions so that it can eventually work with other runtimes.
Interests: Rust, C, C++
with Filip Křikava
Implement support for the Debug Adapter Protocol (DAP) for the R programming language. Test it in VS Code and Vim (Neovim).
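To give a feel for the protocol, here is a minimal sketch of its wire format. DAP messages are JSON objects framed by a `Content-Length` header, and a session starts with an `initialize` request from the client. The sketch is in Python for brevity (the project itself would more likely use R and TypeScript). The helper names and the `adapterID` value `r-debug` are hypothetical.

```python
import json
from typing import Tuple


def encode_dap_message(message: dict) -> bytes:
    """Frame a message: Content-Length header, blank line, JSON body."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_dap_message(data: bytes) -> Tuple[dict, bytes]:
    """Decode one framed message; return it with any leftover bytes."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("incomplete message body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# The first request a client such as VS Code sends to a debug adapter.
# "r-debug" is a made-up adapter name used only for this illustration.
initialize_request = {
    "seq": 1,
    "type": "request",
    "command": "initialize",
    "arguments": {"adapterID": "r-debug", "linesStartAt1": True},
}

framed = encode_dap_message(initialize_request)
decoded, leftover = decode_dap_message(framed)
```

A real adapter would read such frames from standard input or a socket, dispatch on `command`, and answer each request with a message of type `response`.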
Interests: R, TypeScript, Vim
with Filip Křikava
Implement a Language Server Protocol (LSP) server for the Simple Object Machine (SOM), a small dialect of Smalltalk.
Interests: Programming languages, runtime systems
with Filip Křikava
Implement a Debug Adapter Protocol (DAP) adapter for the Simple Object Machine (SOM), a small dialect of Smalltalk.
Interests: Programming languages, runtime systems
with Filip Křikava
Implement debugging support for the Haskell plugin for IntelliJ IDEA.
Interests: Haskell, Java/Scala
with Pierre Donat-Bouillud
Visualizations are a useful tool for teaching algorithms. The goal of this project is to create interactive visualizations of the program analysis algorithms presented in the NI-APR course, using the D3.js library. The course covers static analysis, symbolic execution, and dynamic analysis.
It should be possible to navigate the state of an analysis according to its position in the analyzed program. In type analysis, for instance, each statement or expression could show the type constraints it adds. For symbolic execution, the visualization could show the statements still to be executed according to the worklist, together with the symbolic execution tree built so far.
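The type analysis case could be prototyped roughly as follows. This sketch walks a tiny expression tree and records, for every node, the type constraints that node contributes, which is the kind of per-node data a front end could attach to the program text. The AST classes, the `[[e]]` notation for type variables, and the constraint rules are simplifying assumptions made for illustration. They are not taken from the course material.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Num:
    value: int


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Assign:
    target: str
    value: "Expr"


Expr = Union[Num, Var, Add]
Constraint = Tuple[str, str]  # two type terms that must be equal


def show(node) -> str:
    """Render a node as source text."""
    if isinstance(node, Num):
        return str(node.value)
    if isinstance(node, Var):
        return node.name
    if isinstance(node, Add):
        return f"({show(node.left)} + {show(node.right)})"
    if isinstance(node, Assign):
        return f"{node.target} = {show(node.value)}"
    raise TypeError(f"unknown node: {node!r}")


def constraints_for(node) -> List[Constraint]:
    """Constraints contributed by this node alone, not by its children."""
    if isinstance(node, Num):
        return [(f"[[{show(node)}]]", "int")]
    if isinstance(node, Var):
        return []
    if isinstance(node, Add):
        return [
            (f"[[{show(node.left)}]]", "int"),
            (f"[[{show(node.right)}]]", "int"),
            (f"[[{show(node)}]]", "int"),
        ]
    if isinstance(node, Assign):
        return [(f"[[{node.target}]]", f"[[{show(node.value)}]]")]
    raise TypeError(f"unknown node: {node!r}")


def children(node) -> list:
    if isinstance(node, Add):
        return [node.left, node.right]
    if isinstance(node, Assign):
        return [node.value]
    return []


def trace(node) -> List[Tuple[str, List[Constraint]]]:
    """Post-order walk recording, per node, the constraints it adds."""
    steps = []
    for child in children(node):
        steps.extend(trace(child))
    steps.append((show(node), constraints_for(node)))
    return steps


program = Assign("y", Add(Var("x"), Num(1)))
steps = trace(program)
```

Each entry of `steps` pairs a piece of program text with its constraints, so the list could be serialized to JSON and stepped through by a D3.js front end.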
Interests: program analysis
Design and implement an interpreter for a functional, lazy programming language with high-level vectorized operations. The interpreter will work on a multi-level stack-based intermediate representation called RIR.
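To illustrate the three ingredients (stack-based code, laziness, vectorized operations), here is a toy sketch in Python. The opcode names `push`, `load`, `add`, and `store_lazy` are hypothetical and are not RIR's actual instruction set. The real interpreter would be written in C++.

```python
from typing import Dict, List, Optional, Tuple

Vector = List[float]
Instr = Tuple  # (opcode, operand, ...)


class Promise:
    """A delayed computation: its code runs at most once, on first use."""

    def __init__(self, code: List[Instr], env: Dict[str, object]):
        self.code = code
        self.env = env
        self.value: Optional[Vector] = None
        self.forced = False

    def force(self) -> Vector:
        if not self.forced:
            self.value = run(self.code, self.env)
            self.forced = True
        return self.value


def vec_add(a: Vector, b: Vector) -> Vector:
    """Element-wise addition; a length-1 vector is broadcast."""
    if len(a) == 1:
        a = a * len(b)
    if len(b) == 1:
        b = b * len(a)
    if len(a) != len(b):
        raise ValueError("vector lengths do not match")
    return [x + y for x, y in zip(a, b)]


def run(code: List[Instr], env: Dict[str, object]) -> Optional[Vector]:
    """Execute instructions on an operand stack; return the top value."""
    stack: List[Vector] = []
    for instr in code:
        op = instr[0]
        if op == "push":            # push a constant vector
            stack.append(list(instr[1]))
        elif op == "load":          # read a variable, forcing promises
            value = env[instr[1]]
            if isinstance(value, Promise):
                value = value.force()
            stack.append(value)
        elif op == "add":           # vectorized addition of the top two
            right = stack.pop()
            left = stack.pop()
            stack.append(vec_add(left, right))
        elif op == "store_lazy":    # bind a name to unevaluated code
            env[instr[1]] = Promise(instr[2], env)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop() if stack else None


program = [
    ("store_lazy", "x", [("push", [1.0, 2.0, 3.0])]),
    # Never forced, so the lookup of the undefined name never happens.
    ("store_lazy", "unused", [("load", "missing")]),
    ("load", "x"),
    ("push", [10.0]),
    ("add",),
]
result = run(program, {})
```

The binding of `unused` refers to an undefined variable, but because nothing ever loads it, the program still runs to completion. That is the observable effect of lazy evaluation in this sketch.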
Interests: low-level C++ programming
Study how a stochastic optimizer can improve the performance of language implementation code, such as the core loop of an interpreter. A stochastic optimizer, such as STOKE, uses random search to generate extremely efficient versions of short code snippets.
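The idea of random search over code can be sketched on a toy scale. The example below mutates a fixed-length program for a made-up one-register machine, scores each candidate by its errors on test inputs plus its size, and keeps the smallest candidate that passes all tests. The instruction set, the cost weights, and the Metropolis-style acceptance rule are assumptions chosen for illustration. This is not STOKE's implementation, which works on x86-64 code.

```python
import math
import random
from typing import List, Tuple

# Toy instruction set: each op maps (accumulator, input) to a new accumulator.
OPS = {
    "nop": lambda acc, x: acc,
    "add_x": lambda acc, x: acc + x,
    "dbl": lambda acc, x: acc * 2,
    "neg": lambda acc, x: -acc,
}

Tests = List[Tuple[int, int]]  # (input, expected output)


def execute(program: List[str], x: int) -> int:
    acc = x
    for op in program:
        acc = OPS[op](acc, x)
    return acc


def error(program: List[str], tests: Tests) -> int:
    """Total distance from the expected outputs; 0 means all tests pass."""
    return sum(abs(execute(program, x) - want) for x, want in tests)


def size(program: List[str]) -> int:
    """Performance proxy: number of instructions that do real work."""
    return sum(1 for op in program if op != "nop")


def total_cost(program: List[str], tests: Tests) -> int:
    # Correctness is weighted far above size (the weight is an assumption).
    return 10 * error(program, tests) + size(program)


def optimize(start, tests, steps=5000, beta=1.0, seed=0):
    """Random search from a correct program; returns the best one seen."""
    rng = random.Random(seed)
    names = sorted(OPS)
    current, best = list(start), list(start)
    for _ in range(steps):
        candidate = list(current)
        candidate[rng.randrange(len(candidate))] = rng.choice(names)
        delta = total_cost(candidate, tests) - total_cost(current, tests)
        # Always accept improvements; sometimes accept worse candidates
        # so the search can pass through incorrect programs.
        if delta <= 0 or rng.random() < math.exp(-beta * delta):
            current = candidate
        if error(current, tests) == 0 and size(current) < size(best):
            best = list(current)
    return best


tests = [(x, 4 * x) for x in (1, 2, 3, -5)]
start = ["add_x", "add_x", "add_x", "nop"]  # computes 4 * x in three steps
result = optimize(start, tests)
```

Because `best` only changes when a candidate passes every test and is strictly smaller, the returned program is never worse than the starting one. Passing tests does not prove equivalence, so a real system would also need a verification step.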
Interests: assembly language, language implementation
Engineer a framework that runs regression benchmarks on each commit of a system and visualizes the results. Track performance and memory footprint, as well as other health indicators.
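The core check of such a framework might look like the sketch below, which compares the median running time of each commit with that of the previous one and flags slowdowns above a threshold. The function name, the 10% threshold, and the commit identifiers and timings are all made up for illustration.

```python
from statistics import median
from typing import List, Tuple

History = List[Tuple[str, List[float]]]  # (commit id, timing samples)


def find_regressions(history: History,
                     threshold: float = 0.10) -> List[Tuple[str, float]]:
    """Return (commit, relative slowdown) for each commit whose median
    time exceeds the previous commit's median by more than threshold."""
    regressions = []
    previous = None
    for commit, samples in history:
        current = median(samples)
        if previous is not None and previous > 0:
            change = (current - previous) / previous
            if change > threshold:
                regressions.append((commit, change))
        previous = current
    return regressions


# Illustrative input only: invented commit ids and timings in seconds.
history = [
    ("a1f3", [1.00, 1.02, 0.98]),
    ("b7c2", [1.01, 0.99, 1.00]),
    ("c9d4", [1.30, 1.28, 1.33]),
    ("d0e5", [1.29, 1.31, 1.30]),
]
regressions = find_regressions(history)
```

Using the median of several runs reduces the influence of outliers. A production framework would likely want a statistical test rather than a fixed threshold, and would track memory footprint in the same way.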
Interests: Scripting, Systems programming, JavaScript