A Dataset of Regional Earthquake Waveforms
We present an extensive, quality-controlled dataset of earthquake waveforms recorded at regional distances. Each waveform is 5 minutes long and contains labeled arrivals for the P, Pg, Pn, S, Sn and Sg phases, along with event and station metadata. Every example in the dataset is required to have at least one of the {P, Pg, Pn} arrivals and at least one of the {S, Sg, Sn} arrivals. Arrivals are recorded on three-component instruments at source-receiver distances between 1 and 20 degrees. After initially collecting over 3 million waveforms, we quality controlled the data using an ensemble of machine learning models. First, we trained a recurrent neural network (RNN) that distinguishes between earthquake signals and synthetic noise. This model allows us to flag examples that have labeled arrivals but whose waveforms show no distinguishable earthquake signal. Second, because 5 minutes is a long window in which several earthquakes can be recorded, we used a fine-tuned version of our RNN to flag examples that contain multiple earthquakes, only one of which is labeled. We show preliminary ML models for seismic phase picking trained on the dataset.
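The inclusion criteria stated in the abstract can be illustrated with a minimal sketch. The record layout (`Example`, with `phases`, `distance_deg` and `n_components` fields) and the function name are hypothetical and are not the dataset's actual schema. Treating the 1 and 20 degree bounds as inclusive is also an assumption.

```python
from dataclasses import dataclass

# Phase groups named in the abstract.
P_PHASES = {"P", "Pg", "Pn"}
S_PHASES = {"S", "Sg", "Sn"}


@dataclass
class Example:
    """Hypothetical record for one waveform example (not the real schema)."""
    phases: set            # labeled phase names, e.g. {"Pn", "Sn"}
    distance_deg: float    # source-receiver distance in degrees
    n_components: int      # number of instrument components


def meets_criteria(ex, min_deg=1.0, max_deg=20.0):
    """Return True if the example satisfies the stated inclusion criteria.

    Assumption: the distance bounds are inclusive.
    """
    has_p = bool(ex.phases & P_PHASES)   # at least one of {P, Pg, Pn}
    has_s = bool(ex.phases & S_PHASES)   # at least one of {S, Sg, Sn}
    in_range = min_deg <= ex.distance_deg <= max_deg
    return has_p and has_s and in_range and ex.n_components == 3


candidates = [
    Example({"Pn", "Sn"}, 5.2, 3),   # kept
    Example({"Pg"}, 3.0, 3),         # dropped: no S-type arrival
    Example({"P", "S"}, 25.0, 3),    # dropped: beyond 20 degrees
    Example({"P", "Sg"}, 2.0, 1),    # dropped: single-component instrument
]
kept = [ex for ex in candidates if meets_criteria(ex)]
```

Such a filter covers only the metadata-based criteria. The RNN-based flags for missing signals and for multiple events operate on the waveforms themselves and are not represented here.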
Session: Opportunities and Challenges for Machine Learning Applications in Seismology [Poster]
Type: Poster
Room: Ballroom
Date: 4/19/2023
Presentation Time: 08:00 AM (local time)
Presenting Author: Albert L. Aguilar
Student Presenter: Yes
Additional Authors
Albert Aguilar (Presenting Author, Corresponding Author), aguilars@stanford.edu, Stanford University
Gregory Beroza, beroza@stanford.edu, Stanford University