About Me
Hello, I'm Nathan Sobotka, a student at the University of Pennsylvania. I study computer science and I'm most interested in low-level areas, including computer architecture and compiler verification.
I spent this summer working with Professor Joe Devietti on a project called Robust Profile Guided Runtime Prefetch Generation (RPG2).
The project focused on dynamically inserting prefetch statements into C and C++ binaries to increase performance. RPG2 achieves a speedup of up to 2.15x. Because RPG2 is dynamic, it is capable of tuning the prefetch
distance while the code is running, and it can turn itself off when prefetching hurts performance!
Additionally, I was an intern at the NASA Langley Research Center. There we worked on combining static analysis with property-based testing.
Two common property-based testing techniques, randomized testing and enumeration, both have drawbacks. Randomized testing is unlikely to find rare errors,
while enumeration techniques are unlikely to find large inputs that cause bugs. Because of this, we created a system that uses static analysis to find
the constructor patterns in Haskell code, and uses those patterns to inform a combinatorial testing suite. With this extra guidance,
our combinatorial testing is able to find bugs other testing methodologies cannot.
Education
- 2023-2024 Graduate Education:
MSE in Computer Science from the School of Engineering and Applied Science at the University of Pennsylvania
- 2020-2024 Undergraduate Education:
BSE in Computer Science from the School of Engineering and Applied Science at the University of Pennsylvania
- 2016-2020 High School:
Saint Paul Academy and Summit School
Work
- 2023 Robust Profile Guided Runtime Prefetch Generation (RPG2) with Professor Joe Devietti.
- 2023 Combinatorial and Property Based Testing at the NASA Langley Research Center
- 2023-2024 TA for Computer Organization and Design (CIS 5710)
- 2023 TA for Introduction to Computer Systems (CIS 2400)
- 2022 Compiler verification research with The Vellvm Project and Professor Steve Zdancewic
- 2022 Tutored Mathematical Foundations of Computer Science (CIS 1600) and Linear Algebra (MATH 3120)
- 2016-2020 Coached tennis at Fred Wells Tennis and Education Center
- 2018 Volunteered at the Veterans Hospital in Minnesota, shadowing cardiologist Stefan Bertog
Projects
- RPG2: I worked on Robust Profile Guided Runtime Prefetch Generation (RPG2) under Professor Joe Devietti. In this project we dynamically inserted prefetch statements into binaries. Inserting prefetch statements is an extremely difficult task that varies widely based on program, input, and machine. Dynamic insertion therefore has a few advantages over static insertion, including the ability to tune the prefetch distance and to stop prefetching should it hurt performance. I worked on evaluating speedup, IPC, MPKI, dynamic instruction count, and more across 7 different benchmarks, comparing the results to past work. Additionally, I worked on the search algorithm RPG2 uses to find the optimal prefetch distance.
- NASA: As a NASA intern in combinatorial and property-based testing, one partner and I reviewed multiple academic papers in the field of automated testing in functional programming languages. Based on our findings, we identified a class of bugs that automated testing can sometimes fail to catch: bugs triggered only by large and rare inputs. Given this information, we developed a strategy for automated testing to identify these bugs: use the function constructors to inform combinatorial testing. Because any large and rare bug is likely to be due to a specific case of a function, and therefore written into the code explicitly, our testing methodology is able to identify these bugs.
- The Vellvm Project. I was part of the team working on the memory model of the Vellvm Project, where I worked under Professor Steve Zdancewic. The project is available at https://github.com/vellvm. I worked on testing the memory model by writing unit tests in LLVM, automated tests in QuickChick, and proofs using the Coq Proof Assistant. Additionally, I worked on creating a public monad library for Coq, where we defined and proved the monad laws and other fundamental theorems about the list, state, error, CPS, and free monads, among others.
- In CIS 5550 I developed my own search engine completely from scratch in Java. This included implementing my own web server, key-value store, and my own version of Apache Spark to maximize parallelizability. On top of these foundational programs, this project included building a crawler complete with optimizations to avoid spider traps and crawl a healthy variety of pages, an indexer to compute PageRank and TF-IDF values, and a frontend to display the results. The search engine was hosted on AWS and had over 125 thousand results that could be displayed to users.
- In CIS 3800 I built an operating system in C. My portion of the project included creating the scheduler, complete with priority scheduling, blocking and sleeping, and context switching to optimize CPU utilization. I also integrated it with a custom-built FAT file system and my own shell (with pipelines and background processes), complete with approximately 15-20 runnable commands.
- In CIS 4710 I implemented and tested a two-way, five-stage superscalar pipelined processor in Verilog. It was tested using a benchmark of 1.8 million tests and was fully functional. The processor supported all instructions, including branching and memory operations. I am now a TA for the course.
- In CIS 4600: Interactive Computer Graphics I created a playable version of Minecraft in C++. The game included functional player physics and efficient, multithreaded, infinite terrain generation. It also included many aesthetic features, such as shadow mapping, block animation, and Turing-complete redstone.
- Waffler. Waffle is one of the many Wordle spinoffs, where every board is solvable in 10 moves. This project is an optimal Waffle solver, capable of determining the optimal solution to every daily Waffle. It also includes the ability to play unlimited boards, restart games, and try out the official daily Waffle without fear of failure. Visit the Waffle Game to see and play the official game. Credit to @thatwafflegame for the game concept.
For an in-depth look at my projects, please reach out to me at nsobotka@seas.upenn.edu. I may be permitted to share some of my private repositories with recruiters.
Coursework
- CIS 6010: Advanced Topics in Computer Architecture
- CIS 5590: Programming and Problem Solving
- CIS 5550: Internet and Web Systems
- CIS 5470: Software Analysis
- CIS 5230: Ethical Algorithm Design
- CIS 5210: Artificial Intelligence
- CIS 5110: Theory of Computation
- CIS 4710: Computer Organization and Design
- CIS 4600: Interactive Computer Graphics
- CIS 4500: Database and Information Systems
- CIS 3990: Human Computer Interaction
- CIS 3800: Computer Operating Systems
- CIS 3200: Introduction to Algorithms
- CIS 2620: Automata, Computability, and Complexity
- CIS 2400: Introduction to Computer Systems
- CIS 1920: Python Programming
- CIS 1600: Mathematical Foundations of Computer Science
- CIS 1210: Programming Languages and Techniques II
- CIS 1200: Programming Languages and Techniques I
- ESE 3010: Engineering Probability
- MATH 3120: Linear Algebra
- MATH 2410: Calculus, Part IV (Partial Differential Equations)
- MATH 2400: Calculus, Part III (Linear Algebra and Ordinary Differential Equations)
- MATH 1410: Calculus, Part II (Multivariable Calculus) *Placed out
More information can be found in the course catalog.
Contact information
My email address is nsobotka@seas.upenn.edu.