Machine listening is the auditory sibling of computer vision: it aims to extract meaningful information from audio signals. It has the potential to support numerous tasks, including understanding and improving the health of our cities (e.g., monitoring and mitigating noise pollution) and natural environments (e.g., monitoring and conserving biodiversity). In our research, we investigate problems all along the machine listening pipeline, from both human and technical perspectives, with the aim of building better tools for understanding sound at scale.