Day 7
The input looks like this:
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
The problem is parsing this file structure and doing a sum per directory of file sizes, then outputting… well, actually, the problem is, errr… implementing a tree filesystem. Ugh.
This one was a bit more complicated. My first instinct was that I needed something like this:
[{'/': ['a', 14848514, 8504156, 'd']}
{'a': ['e', 29116, 2557, 62596]}
{'e': [584]}
{'d': [4060174, 8033020, 5626152, 7214296]}]
Where I could apply the following process:
- Sum all directories where all elements are integers (here, e and d)
- Substitute the names of the directories elsewhere for the sum
- Repeat untill all directories are a single integer
Not terrible, but what I actually needed what this:
[{'/': ['a', 14848514, 8504156, 'd'],
'a': ['e', 29116, 2557, 62596]
'e': [584],
'd': [4060174, 8033020, 5626152, 7214296]}]
Mostly because working with lists withing dictionaries within lists was a road to madness. Working like this is way cleaner. Also!! The input example doesn’t match the actual input! That was a lot of fun. So rather, since the same name can appear in different places, you want:
[{'/': ['a', 14848514, 8504156, 'd'],
'/a': ['e', 29116, 2557, 62596]
'/a/e': [584],
'/d': [4060174, 8033020, 5626152, 7214296]}]
After quite some effort I ended up doing this for part one (comments directly on the code):
import re
with open("input.txt") as file:
input = file.readlines()
input = [line.strip('\n') for line in input]
tree = []
pwd = "/"
for line in input:
if line.startswith("$ cd"):
ls = []
match = re.search("cd (.*)$", line)
cd_dir = match.group(1)
# store first ls
if cd_dir == "/":
# strip if dir
dir = re.search(r"dir (.*)", line)
file_size = re.search(r"\d+", line)
if dir:
ls.append(pwd+dir.group(1))
if file_size:
ls.append(int(file_size.group(0)))
tree.append({"/" : ls})
if cd_dir != "..":
if cd_dir != "/":
pwd = pwd+cd_dir+"/"
tree.append({pwd : ls})
if cd_dir == "..":
pwd = pwd.split("/")
pwd = [x for x in pwd if x != ""]
pwd = pwd[:-1]
pwd = "/"+"/".join(pwd)+"/"
pwd = pwd.replace("//","/")
if "$" not in line:
# get dir name if dir
dir = re.search(r"dir (.*)", line)
if dir:
ls.append(pwd+dir.group(1)+"/")
# get file sizes as int
file_size = re.search(r"\d+", line)
if file_size:
ls.append(int(file_size.group(0)))
# transform list of dictionaries into a megadictionary
# with all paths
flat_tree = {}
for d in tree:
flat_tree.update(d)
After getting the dictionary of dictionaries, it was a matter of replacing the values:
def sum_values(tree):
for key, values in tree.items():
# control for done sums
if isinstance(values, int):
continue
# if its a list of digits, sum
if all(isinstance(x, int) for x in values):
total = sum(values)
tree.update({key:total})
return tree
def replace_values(tree):
tree = sum_values(tree)
keys = list(tree.keys())
for key, values in tree.items():
if isinstance(values, list):
to_check = [x for x in values if isinstance(x, str)]
for directory in to_check:
if isinstance(tree[directory], int):
values[values.index(directory)] = tree[directory]
tree[key] = values
# recursive call:
all_values = list(tree.values())
for i in all_values:
if not isinstance(i, int):
replace_values(tree)
tree = sum_values(tree)
return tree
else:
return tree
# Run the above code, and filter
result_size = replace_values(flat_tree)
final = {k: v for k, v in result_size.items() if v < 100000}
print("Solution part 1:", sum(final.values()))
Part two requires that you calculate left space in the filesystem, get how much space you need for some update and pick the smallest folder that fulfills that criteria:
# part two
left_space = 70000000-result_size.get("/")
to_free = 30000000-left_space
final2 = {k: v for k, v in result_size.items() if v > to_free}
candidates = list(final2.values())
print("Solution part 2:min(candidates))
Way easier once you have implemented the filesystem! Pew! And that’s it!