GSoC 2024 at Chromium
Introduction
This summer I have been contributing to the Chromium organization as part of the Google Summer of Code (hereinafter referred to as GSoC), working on a project involving ChromiumOS under the mentorship of two Google engineers.
Project description
Currently, Chromebooks safely encrypt all user data, and a device is “erased” by erasing the vault containing the encryption key, which makes the data unreadable. While this does achieve the end goal of making user data unintelligible once the user decides to erase their device, it is not compliant with the current state-of-the-art guidelines on media sanitization defined by NIST. The goal of this project is to extend the erase method currently offered by ChromeOS by proposing a way to safely erase data from a Chromebook’s internal storage that is based on sanitize commands and compliant with the NIST SP 800-88 Guidelines for Media Sanitization.
Project proposal
The first thing to do when applying to the GSoC program is to find an organization whose projects and products feel exciting to you, and to look into its project ideas. Once I had found several project ideas that piqued my interest, I began doing some research in order to submit a proposal. A key part of understanding how to approach the project was becoming familiar with the NIST guidelines mentioned above and working out which code paths would change during the project’s development.
For the latter, the Chromium (and ChromiumOS) project offers a great tool, Chromium Code Search, which lets people browse the Chromium source tree very quickly without cloning the repositories, and has a very powerful search engine for finding code snippets that could be affected by what you’ll be working on.
Once you have a project proposal draft you’re happy with, you are strongly encouraged to reach out to the mentors listed for that specific project to get feedback on how to improve it before the final submission.
My project proposal is available in this Google Document.
First milestone
The first milestone my mentors and I set together was to understand what needed to be done and to build a functioning demo of the project. This demo was written entirely as bash scripts, with many hardcoded variables, hoops to jump through, and preconditions to be met; its purpose was to show the team what the erase flow should look like and that the project was feasible.
Since the goal of this project is to fully erase and sanitize the on-device physical storage, we needed a way to copy the operating system partition to RAM so that the physical storage device would not be in use. This was achieved through an initramfs, a root filesystem that is embedded into our kernel and loaded at an early stage of the boot process, which allowed us to create a mirror of all the mission-critical read-only partitions that need to be copied back to the device after the disk has been sanitized. An initial solution we explored for this mirroring operation was using RAID 1 arrays to obtain a 1:1 copy of each target partition’s data, but this was soon replaced with a copy from each target partition to a temporary file that lives in RAM until the next restart. The rest of the erase flow for the demo was implemented with a script that ran the storage controller’s sanitization commands and copied the data back.
The demo didn’t go that well at first: I had never presented to a team before, and stage fright got to me to the point that I accidentally demoed the wrong image, which crashed. Even so, the team was very enthusiastic about the progress made on the project! The next step was a migration to C++, as a solution that relies exclusively on bash is not acceptable for plenty of reasons.
The next step
Even with the C++ migration of most of the code, the project still requires a bash-based initramfs to create an in-memory mirror of our partition containing ChromiumOS. Pivoting to this partition from our initramfs enables us to get access to the wide set of tools and APIs the ChromiumOS system has to offer, while also freeing our internal storage from any read-write operations that could interfere with the sanitization process.
The final version of the initramfs includes logic to detect whether an erase has been requested; depending on the outcome, we either proceed with the regular boot process or create our mirror, prepare it for the sanitization process and continue with the boot sequence that will trigger the rest of the erase flow.
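The decision described above can be sketched in a few lines. This is only an illustration written in C++ (the real initramfs logic is a bash script), and it assumes the erase request is signalled by a marker file; the marker path and the names BootPath, CanOpenForReading and ChooseBootPath are hypothetical and not taken from the ChromiumOS sources.

```cpp
#include <fstream>
#include <string>

// Hypothetical outcomes of the early-boot check.
enum class BootPath { kRegularBoot, kEraseFlow };

// Returns true if `path` can be opened for reading.
inline bool CanOpenForReading(const std::string& path) {
  std::ifstream file(path);
  return file.good();
}

// Assumption: an erase request is signalled by the presence of a marker
// file. If it is there, we take the erase path (build the in-RAM mirror and
// keep booting into the erase flow); otherwise we boot normally.
inline BootPath ChooseBootPath(const std::string& request_marker) {
  return CanOpenForReading(request_marker) ? BootPath::kEraseFlow
                                           : BootPath::kRegularBoot;
}
```

Keeping the check in one small function with the marker path as a parameter makes the two outcomes easy to exercise in a test without touching a real device.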
The list of the CLs related to the initramfs is:
- initramfs: Add target for disk sanitization ramdisk to initramfs ebuilds
- kernel: initramfs: Disable DM_INIT for prod_ramfs
- initramfs: Added NIST Erase initramfs
- initramfs: Added root vars fetching to NIST erase initramfs
- initramfs: Add detection of NIST Erase request to initramfs
- initramfs: Add migration of rootfs to RAM to NIST Erase
- initramfs: Add support for DM verity rootfs to pivot
- initramfs: Remove hardcoded device path for DM verity rootfs
During our erase flow, once we execute the /sbin/init included in ChromiumOS, execution gets handed over to chromeos_startup. Originally all of my changes were included in this binary, but given the project’s estimated size, and to make unit testing easier, it was recommended to create another executable, which is called by chromeos_startup if the preconditions for sanitization are met.
This binary is responsible for finding the partitions that we are going to copy temporarily to RAM, erasing the physical storage using the sanitization commands of the storage controller, restoring the data, and rebuilding the partition table to make sure the system is still bootable.
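To make the order of these steps concrete, here is a toy model of the flow. It is a sketch under heavy assumptions, not the actual binary: the "disk" is just an in-memory map from partition label to contents, Sanitize only clears that map instead of issuing a real storage controller command, and every name in it (Disk, MirrorToRam, Sanitize, Restore, EraseFlow) is made up for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;
// Toy stand-in for the internal storage: partition label -> contents.
// The set of keys plays the role of the partition table.
using Disk = std::map<std::string, Bytes>;

// Copy the partitions that must survive the erase into a RAM-backed mirror.
Disk MirrorToRam(const Disk& disk, const std::vector<std::string>& keep) {
  Disk mirror;
  for (const std::string& label : keep) {
    const auto it = disk.find(label);
    if (it != disk.end()) mirror[label] = it->second;
  }
  return mirror;
}

// Stand-in for the storage controller's sanitize command: in this model,
// everything on the disk is lost, including the partition entries.
void Sanitize(Disk& disk) { disk.clear(); }

// Recreate the partition entries and write the mirrored data back.
void Restore(Disk& disk, const Disk& mirror) {
  for (const auto& entry : mirror) disk[entry.first] = entry.second;
}

// The whole flow, in the order described above.
void EraseFlow(Disk& disk, const std::vector<std::string>& keep) {
  const Disk mirror = MirrorToRam(disk, keep);
  Sanitize(disk);
  Restore(disk, mirror);
}
```

In this model, calling EraseFlow with a list of the partitions to preserve leaves only those partitions on the disk, with their original contents, while everything else (for example the user data partition) is gone. The important property is the ordering: the mirror has to be complete before the sanitize step runs, because nothing on the device survives it.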
The list of the CLs related to the NIST Erase binary is:
- init: Add secure-wipe to chromeos-init dependencies
- init: Add rebuild_gpt.sh to /usr/sbin during build stage
- init: startup: Add check for NIST Erase request file
- init: startup: Add root variables fetching to NIST erase
- init: startup: Add partitions lookup to prepare for erase
- init: startup: Add data migration to NIST erase
- init: nist_erase: Add script to rebuild GPT
- init: startup: Add erase and restore flow
- init: startup: Add call to NIST erase flow
What have I learned?
First of all, I’ve learned a lot about the culture at Google; because of the project’s size and impact on the codebase, I had the opportunity to talk to people from all sorts of teams, and they enthusiastically went above and beyond to give me feedback and pointers. Technically, this project has been quite challenging, as it was my first exposure to many of the topics it spanned. I learned how to navigate a very large codebase and how to find the right APIs and libraries, so as not to reinvent the wheel when a task has already been tackled at some point in the codebase’s lifetime. I built my own initramfs for the first time and got to enjoy designing and implementing a solution from the ground up. I also learned a lot about the Gentoo and Portage ecosystem, about using USE flags to make sure my code is included in an image only when I want it to be, and about the build process of a project as big as an operating system. This was also my first experience with C++ beyond small university projects; I’ve grown to love it, and I will definitely consider it for my next projects!
If I had to pick the most important takeaway for my growth as a software engineer, it would definitely be the importance of code reviews, communication, and writing unit tests for my code to develop reliable software.
The biggest hurdle during these weeks was the fact that a lot of the tools that were needed to solve some of the problems I was experiencing are only available to Googlers, and sadly being a part of the GSoC program does not give you access to any of these resources. This issue has slowed development down a bit and made things harder overall, but thanks to my mentors’ support I was able to get through all of the issues.
Looking ahead
Even though the goals outlined in the project proposal have been met, this project still has a long way to go, as we are just getting started! Some of the next changes could be integration with the enterprise console to trigger device wipes remotely, UI elements and copy, and localization. While I unfortunately had to cut the number of weeks I could dedicate to the GSoC program for academic reasons, I’d love to keep contributing to ChromiumOS, and I definitely will in the near future. :-)
Thank yous and acknowledgements
A special thank you goes to my mentor Gwendal Grignou, who has helped me through every step of the project with weekly 1:1s and an endless number of Google Chat messages, while also making sure I took things slow and didn’t overwork myself and burn out, and to my other mentor Torsha Banerjee, who has provided lots of support throughout the summer. I’d also like to thank the Chromium org admins for GSoC, Sreeja Kamishetty and Stephen Nusko, who have been working hard to accommodate all of our requests and make this happen, and the entire team of GSoC program admins for allowing us to participate in such a good program to strengthen our skills and widen our professional network. Thank you to Bastian, Sarthak and Jae, who have helped me tackle a lot of issues and whose inboxes have always been open for any doubt I had, even when it was outside their main area of work.