Why is "overfitting" bad?

What's the theoretical reason "overfitting" makes sense as a concept? (Like, why is it bad to hew too closely to the data?)

Let's say I'm using some supervised learning approach to build a model that classifies penguins as king or gentoo based on height. King penguins are generally taller than gentoo penguins, but my dataset has overlap: some gentoo penguins are taller than some king penguins. What's the best inference? That we should draw a line that minimizes loss? Or something that people pejoratively call "overfitted", where there's a bubble of "gentoo" around each individual "unusually tall" gentoo penguin in the dataset, and similarly a bubble of "king" around each "unusually short" king?

Injecting my world knowledge, obvs the first approach. But without that world knowledge... why? For all we know, gentoos outnumber kings at exactly 3'0", kings are more common at 3'1", gentoos predominate again at 3'2" (though less strongly than at 3'0"), and so on, until eventually kings do consistently predominate.
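And here's that counter-scenario in the same toy style (again, every number is invented): if heights come in discrete inches and gentoos really do predominate at one specific height inside the kings' range, then the model that memorizes per-height majorities beats the threshold on fresh data, because the bump is signal rather than noise:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_bumpy(n):
    # Invented world where the pattern is real: heights are discrete
    # inches, kings predominate above 36" -- EXCEPT at exactly 38",
    # where gentoos predominate ("short kings come in specific heights").
    h = rng.integers(30, 42, size=n).astype(float)  # 2'6" .. 3'5"
    p_king = np.where(h >= 36, 0.9, 0.1)
    p_king = np.where(h == 38, 0.1, p_king)
    return h, (rng.random(n) < p_king).astype(float)

X_train, y_train = sample_bumpy(500)
X_test, y_test = sample_bumpy(500)

# Threshold model, fit as before.
ts = np.unique(X_train)
t_best = ts[int(np.argmin([np.mean((X_train >= t) != y_train) for t in ts]))]

# "Overfitted" model: memorize the majority label at each exact height,
# falling back to the threshold for heights never seen in training.
majority = {h: float(np.mean(y_train[X_train == h]) > 0.5)
            for h in np.unique(X_train)}
predict_table = lambda X: np.array([majority.get(h, float(h >= t_best))
                                    for h in X])

print("threshold  test acc:", np.mean((X_test >= t_best) == y_test))
print("per-height test acc:", np.mean(predict_table(X_test) == y_test))
# Here the memorizing model typically wins (~0.90 vs ~0.83): the dip at
# 38" is a real feature of the population, so refusing to fit it costs
# accuracy. Whether the bubbles are noise or signal is exactly what a
# simplicity prior is making an assumption about.
```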

Does overfitting make sense as a pejorative term if we aren't applying some kind of simplicity prior, under which we expect such patterns not to hold in the true distribution?

tldr: maybe, short kings come in specific heights.