Method Article
This protocol presents PIPEMAT-RS, a standardized MATLAB-based preprocessing pipeline for resting-state EEG data. It ensures artifact removal, improves signal quality, and enhances data reproducibility across studies. The pipeline automates key preprocessing steps, including filtering, independent component analysis (ICA), and artifact classification, facilitating consistent and reliable EEG analysis for neurophysiological research.
Electroencephalography (EEG) is a crucial tool in neuroscience research and clinical applications, but raw EEG data often contain noise and artifacts that compromise signal quality. To address this, we developed PIPEMAT-RS, a standardized MATLAB-based preprocessing pipeline for resting-state EEG data. PIPEMAT-RS follows a structured seven-step workflow: file format conversion, EEG montage configuration, downsampling and filtering, artifact rejection and rereferencing, independent component analysis (ICA), ICLabel classification for automated artifact removal, and data normalization. This protocol enhances EEG data quality by minimizing human intervention while maintaining high accuracy in artifact rejection. It was validated using multiple datasets, demonstrating its robustness in improving signal integrity. PIPEMAT-RS provides a systematic approach that facilitates reproducibility and reliability in EEG studies, aligning with commonly adopted practices in the field and offering a clearly documented structure that can complement existing pipelines. By standardizing EEG preprocessing, PIPEMAT-RS facilitates neurophysiological research and clinical applications, allowing for more accurate interpretations of resting-state brain activity and its associations with neurological and psychiatric conditions.
The electroencephalogram (EEG) is a critical tool in both neuroscience research and clinical practice, offering valuable insights into brain electrical activity. Resting-state EEG data, in particular, are extensively utilized to investigate various neurological and psychiatric conditions, such as stroke, fibromyalgia, and chronic neuropathic pain. Despite its widespread application, the analysis of EEG data necessitates rigorous preprocessing to eliminate artifacts and noise, ensuring the integrity and reliability of the results1,2,3. Over the years, EEG has been invaluable in providing non-invasive and high-temporal resolution measurements of brain activity, making it indispensable for studying the dynamic processes of the brain4,5. However, the raw EEG signals are often contaminated by a variety of noise sources that can obscure the neural signals of interest, thereby complicating the interpretation of the data6.
EEG signals are highly susceptible to contamination from sources such as muscle activity, eye movements, and electrical interference⁷. These artifacts can obscure neural signals, making preprocessing a critical step in EEG analysis. This process typically involves multiple stages: file format conversion, downsampling, filtering, artifact rejection, independent component analysis (ICA), and exclusion of noise-related components⁸,⁹. Each step contributes to improving signal quality and ensuring that the data accurately reflect neural activity¹⁰. The complexity of these processes necessitates the use of sophisticated algorithms and software tools to ensure that the preprocessing is both efficient and effective¹¹.
One of the primary challenges in EEG data preprocessing is the variability in the data, which can differ significantly between subjects and recording sessions10,12. This variability arises due to differences in the physiological and anatomical characteristics of the subjects, as well as variations in the experimental conditions and equipment used. Additionally, the lack of standardized preprocessing protocols can lead to inconsistencies in data analysis and interpretation13,14. While several pipelines and preprocessing scripts are available, many of them are either tailored to specific datasets or lack comprehensive documentation, making them less accessible to the broader research community14. Platforms like GitHub host numerous EEG preprocessing scripts and plugins, facilitating the sharing and collaborative improvement of these tools15. However, the fragmented nature of these resources underscores the need for a robust, efficient, and standardized editable preprocessing pipeline that can be broadly applied across different datasets and research contexts16. Such a pipeline would not only enhance the reproducibility of EEG studies but also provide a foundation upon which the research community can build and adapt for specific needs17.
In addition to EEGLAB, which forms the basis of PIPEMAT-RS due to its extensive user base and integration with MATLAB, other EEG processing toolboxes are also widely used in the field. For instance, MNE-Python is an open-source Python-based platform that offers advanced functionalities for source localization, sensor-space analyses, and integration with machine learning workflows. FieldTrip, a MATLAB-based toolbox like EEGLAB, is known for its flexible scripting environment and comprehensive support for time-frequency and connectivity analyses. While these platforms provide powerful alternatives, they typically require steeper learning curves and more advanced programming skills. PIPEMAT-RS was designed to bridge accessibility and transparency, especially for researchers seeking an editable and modular solution for resting-state EEG preprocessing using EEGLAB. Nonetheless, the conceptual structure of PIPEMAT-RS can be adapted to other toolboxes as needed, fostering interoperability and customization across different software environments.
Several automated EEG preprocessing pipelines have been developed in recent years, such as RELAX¹⁸, Automagic¹⁹, APP²⁰, and PREP⁷. These pipelines incorporate advanced algorithms for artifact removal and data cleaning, often employing ICA-based strategies with limited transparency or customization for non-expert users. For instance, RELAX introduces a multi-step ICA cleaning method that improves artifact rejection but relies on complex configurations that may not be easily modifiable²¹. In contrast, PIPEMAT-RS is designed to provide a flexible, transparent, and educationally accessible alternative, enabling researchers to understand and manually adjust each preprocessing step. This transparency allows PIPEMAT-RS to be used as both a research and teaching tool while maintaining rigor and reproducibility. Moreover, PIPEMAT-RS emphasizes documentation, modular execution, and adaptability across datasets, features that are often underreported or absent in other pipelines.
Economically, the development and implementation of a standardized editable preprocessing pipeline for EEG data can have significant impacts. The global EEG devices market was valued at approximately USD 1.2 billion in 2020 and is expected to grow at a compound annual growth rate (CAGR) of 7.5% from 2021 to 2028²². Efficient preprocessing pipelines, particularly those that are editable and modular, can reduce the time and cost associated with data analysis, an important consideration in large-scale clinical trials and public health research. By minimizing human error and automating repetitive tasks, such pipelines can improve the accuracy of EEG data analysis and lead to more reliable research outcomes¹³. This, in turn, can accelerate the development of new diagnostic tools and therapeutic interventions, potentially reducing healthcare costs and improving patient outcomes¹.
This paper aims to present and validate a standardized editable Preprocessing Integrated Pipeline for Resting-State EEG using MATLAB (PIPEMAT-RS). Designed to be both robust and efficient, this editable pipeline is versatile and can be widely applied to various datasets and research scenarios. The seven steps are file format conversion, EEG montage, downsampling and filtering, artifact rejection and rereferencing, independent component analysis (ICA), ICLabel classification, and data normalization, all integrated into a cohesive workflow²³,²⁴. The development of PIPEMAT-RS was guided by the need to address common challenges in EEG preprocessing and to provide a tool that researchers can use to streamline their data analysis processes²⁵. By integrating with commonly adopted practices in the field and leveraging the capabilities of MATLAB, PIPEMAT-RS aims to offer a reliable, editable, and user-friendly framework to support EEG data preprocessing²⁶.
To validate the effectiveness of PIPEMAT-RS, we have applied it in several studies. In one study, PIPEMAT-RS was utilized to investigate EEG biomarkers in stroke patients, identifying significant correlations between maladaptive motor function and depressive profiles27. Another study applied PIPEMAT-RS to analyze neural signatures of brain compensation in stroke patients using EEG and TMS, revealing critical insights into the brain's adaptive mechanisms post-stroke28. Further validation comes from a study that explored the neural adaptations and compensations in fibromyalgia patients29. Additionally, another study demonstrated how delta and theta bands in resting-state EEG serve as compensatory mechanisms in chronic neuropathic pain30.
These studies collectively highlight PIPEMAT-RS's utility in processing and analyzing EEG data, establishing its efficacy across various conditions and applications. By providing a standardized approach, this pipeline not only enhances the reproducibility of EEG studies but also facilitates the broader adoption of consistent preprocessing methods in the research community. Through this paper, we aim to contribute a valuable resource to the field, promoting more accurate and reliable EEG data analysis16,17. In doing so, we hope to support the advancement of research in neurophysiology and the development of novel therapeutic approaches for neurological and psychiatric conditions.
The foundational preprocessing steps used in PIPEMAT-RS are rooted in robust, widely validated methods that have become essential in EEG research for ensuring data quality and reproducibility2,8,14. Although these methodologies are well-established, they remain crucial due to their effectiveness in addressing common EEG artifacts and noise, which continues to support reliable analyses across studies. However, few studies have recently provided a fully documented, standardized pipeline tailored for resting-state EEG. By integrating these proven techniques with recent advances, such as ICLabel for automated artifact classification, PIPEMAT-RS delivers a contemporary, structured approach that enhances replicability and accessibility for researchers, fulfilling an ongoing need for rigorously documented preprocessing workflows in the field.
While the PIPEMAT-RS pipeline follows established preprocessing steps commonly employed in EEG studies, its documentation in the form of a scientific paper fills an important gap in the literature. Many EEG studies rely on similar preprocessing methods but often lack a comprehensive, step-by-step description that allows for full reproducibility and ease of adoption by other researchers. By detailing each component of PIPEMAT-RS and presenting a standardized sequence for resting-state EEG analysis, this work aims to enhance transparency, minimize user variability, and provide a replicable framework applicable across studies. Rather than introducing novel methods, this contribution is intended to serve as an instructional and practical guide that helps standardize preprocessing practices and improve accessibility for a wide range of EEG researchers, particularly those working with resting-state data.
The preprocessing of EEG data is a critical step that significantly influences the quality and reliability of subsequent analyses. The Preprocessing Integrated Pipeline for Resting-State EEG using MATLAB (PIPEMAT-RS) was developed to address common challenges in EEG data preprocessing by providing a comprehensive, standardized approach applicable to various datasets. This structured, editable pipeline consists of seven key steps: i) File Format Conversion; ii) EEG Montage; iii) Downsampling and Filtering; iv) Artifact Rejection and Rereferencing; v) Independent Component Analysis (ICA); vi) ICLabel Classification; vii) Data Normalization. Each step is meticulously designed to enhance signal quality and facilitate the extraction of meaningful information from raw EEG data.
A critical feature of PIPEMAT-RS is that each preprocessing step concludes with saving the dataset under a distinct filename. This structure ensures that, for each processed file, a derivative version corresponding to each specific step is generated. This systematic approach allows users to access data at any point within the seven-step preprocessing pipeline. The saved files include identifiers that clearly indicate the stage of preprocessing, making it easier to track and manage data throughout the workflow.
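The per-step saving scheme described above can be illustrated with a short, language-neutral sketch. It is written in Python rather than MATLAB so that it runs on its own, and it is not taken from the PIPEMAT-RS script: the step tags, the naming pattern, and the .set extension (the EEGLAB dataset format) are assumptions chosen for illustration, since the article states only that each saved file carries an identifier for its preprocessing stage.

```python
from pathlib import Path

# Hypothetical stage tags: the article names the seven steps but does not
# specify the identifiers used in the saved file names.
STEP_TAGS = [
    "converted",
    "montage",
    "resampled_filtered",
    "cleaned_reref",
    "ica",
    "iclabel",
    "normalized",
]


def derivative_name(raw_file, step):
    """Return the file name for the output of a 1-based pipeline step."""
    if not 1 <= step <= len(STEP_TAGS):
        raise ValueError(f"step must be between 1 and {len(STEP_TAGS)}")
    stem = Path(raw_file).stem
    # The ".set" extension is an assumption (EEGLAB's dataset format).
    return f"{stem}_step{step}_{STEP_TAGS[step - 1]}.set"


def all_derivatives(raw_file):
    """List the derivative file names produced for one raw recording."""
    return [derivative_name(raw_file, s) for s in range(1, len(STEP_TAGS) + 1)]


# "sub01.vhdr" is a made-up input file name.
for name in all_derivatives("sub01.vhdr"):
    print(name)
```

Because every step writes its own file, processing can resume from any stage, and an intermediate result can be inspected without rerunning the earlier steps.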
PIPEMAT-RS begins with file format conversion, where raw EEG data files are transformed into a MATLAB-compatible format. This conversion is essential for the seamless handling of data in MATLAB's robust computational environment. Following this, the EEG montage step assigns precise spatial locations to each electrode on the scalp based on standardized electrode placement systems. This spatial information is crucial for the accurate interpretation of EEG data in neurophysiological studies.
Next, the pipeline applies downsampling and filtering to reduce the data's sampling rate and eliminate noise while retaining the frequency components relevant to neurophysiological research. Downsampling decreases the computational load and storage requirements without compromising data quality. Filtering removes specific frequency components known to be associated with artifacts, thereby preserving the essential neural signals.
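To make the relationship between downsampling and filtering concrete, the following minimal sketch reduces the sampling rate by an integer factor using block averaging. This is a deliberately simplified stand-in, not the resampling method used by the pipeline: toolbox resampling routines generally apply a dedicated anti-aliasing filter, whereas averaging is only a crude low-pass. The sampling rate, factor, and signal values are hypothetical.

```python
def downsample_by_averaging(samples, factor):
    """Reduce the sampling rate by an integer factor.

    Each output sample is the mean of `factor` consecutive input samples.
    Averaging acts as a crude low-pass filter, which limits aliasing.
    Trailing samples that do not fill a complete block are dropped.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    n_blocks = len(samples) // factor
    return [
        sum(samples[i * factor:(i + 1) * factor]) / factor
        for i in range(n_blocks)
    ]


def nyquist_hz(sampling_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sampling_rate_hz / 2.0


# Hypothetical recording: 1000 Hz reduced by a factor of 4.
original_rate_hz = 1000.0
factor = 4
signal = [float(i % 8) for i in range(16)]

reduced = downsample_by_averaging(signal, factor)
reduced_rate_hz = original_rate_hz / factor

print(len(signal), "->", len(reduced), "samples at", reduced_rate_hz, "Hz")
print("highest representable frequency:", nyquist_hz(reduced_rate_hz), "Hz")
```

The point of the Nyquist calculation is that any band-pass edge applied after downsampling has to sit below half the new sampling rate; the 1-50 Hz range used later in the article for the signal-to-noise calculation satisfies this for the hypothetical 250 Hz rate above.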
Subsequently, artifact rejection and rereferencing are performed. This step identifies and removes segments of data contaminated by noise sources such as eye blinks, muscle activity, and electrode movement. Automatic artifact rejection algorithms are utilized to minimize manual intervention, reducing the potential for human error and increasing preprocessing efficiency. Rereferencing the data to the average of all electrodes helps to mitigate the influence of any single electrode and provides a stable reference for the EEG signals.
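Average rereferencing, as described above, amounts to subtracting the mean across all electrodes from every channel at each time point. The sketch below shows that arithmetic on a toy three-channel recording; the data layout (a list of channels, each a list of samples) and the values are assumptions for illustration and do not reflect how the pipeline stores data.

```python
def average_reference(data):
    """Rereference channels to the mean of all channels.

    `data` is a list of channels, each a list of samples of equal length.
    Returns a new list in which the across-channel mean has been removed
    from every channel at each time point.
    """
    if not data:
        return []
    n_channels = len(data)
    n_samples = len(data[0])
    if any(len(channel) != n_samples for channel in data):
        raise ValueError("all channels must have the same number of samples")
    means = [
        sum(channel[t] for channel in data) / n_channels
        for t in range(n_samples)
    ]
    return [[channel[t] - means[t] for t in range(n_samples)] for channel in data]


# Toy data: three channels, two time points (arbitrary units).
recording = [
    [1.0, 2.0],
    [3.0, 6.0],
    [5.0, 10.0],
]
rereferenced = average_reference(recording)
print(rereferenced)
```

After rereferencing, the channels sum to zero at every time point, which is why no single electrode dominates the reference.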
The Independent Component Analysis (ICA) step further refines the data by separating mixed signals into their independent sources. This decomposition facilitates the identification and removal of artifacts such as eye blinks, muscle artifacts, and line noise. ICA is a powerful technique that ensures the remaining data accurately reflects underlying neural activity.
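The removal of an artifact component can be sketched as a back-projection: channel data are expressed as a mixture of component activations, the unwanted activation is set to zero, and the remaining components are mixed back into channel space. The toy example below assumes the mixing matrix is already known, so it does not perform ICA itself (in practice the unmixing matrix is estimated from the data). All names and values are illustrative.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def invert_2x2(m):
    """Invert a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


def remove_components(mixing, activations, reject):
    """Back-project activations to channels, zeroing the rejected components."""
    kept = [
        [0.0] * len(row) if index in reject else row
        for index, row in enumerate(activations)
    ]
    return matmul(mixing, kept)


# Two toy sources: an oscillation and a single blink-like transient.
sources = [
    [1.0, -1.0, 1.0, -1.0],
    [0.0, 5.0, 0.0, 0.0],
]
mixing = [[1.0, 0.5], [0.5, 1.0]]          # assumed known for this example
channels = matmul(mixing, sources)         # what the electrodes would record

unmixing = invert_2x2(mixing)              # ICA would estimate this from data
activations = matmul(unmixing, channels)   # recovered component time courses
cleaned = remove_components(mixing, activations, reject={1})
print(cleaned)
```

The cleaned channels contain only the contribution of the first source. With real recordings the separation is imperfect, so a rejected component can carry some neural signal along with the artifact.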
Finally, ICLabel classification is applied to automatically identify and remove components classified as artifacts. ICLabel assigns probabilistic labels to independent components, categorizing them as brain activity, muscle activity, eye blinks, heartbeats, line noise, or channel noise. Components with a high probability of representing brain activity are retained, while those identified as artifacts are removed. This automated approach significantly reduces manual effort and ensures consistent and objective classification across datasets.
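The retention rule can be sketched as a simple threshold on the brain probability of each component. The example below assumes ICLabel's seven output classes (the six named above plus an "Other" category) and uses a cutoff of 0.7, the value the article reports in its discussion. The probability values are made up, and the function is an illustration rather than the pipeline's code.

```python
# Class order assumed to follow ICLabel's output; "Other" is its seventh class.
CLASSES = ("Brain", "Muscle", "Eye", "Heart", "Line Noise", "Channel Noise", "Other")


def split_components(class_probabilities, brain_threshold=0.7):
    """Split component indices into (kept, rejected) by brain probability.

    `class_probabilities` holds one row per component, with one probability
    per entry of CLASSES. A component is kept only if its brain probability
    is strictly greater than `brain_threshold`.
    """
    brain = CLASSES.index("Brain")
    kept, rejected = [], []
    for index, probs in enumerate(class_probabilities):
        if len(probs) != len(CLASSES):
            raise ValueError(f"component {index} needs {len(CLASSES)} probabilities")
        (kept if probs[brain] > brain_threshold else rejected).append(index)
    return kept, rejected


# Made-up probabilities for three components.
example = [
    [0.92, 0.02, 0.01, 0.01, 0.01, 0.01, 0.02],  # clearly brain
    [0.05, 0.03, 0.85, 0.01, 0.01, 0.01, 0.04],  # eye artifact
    [0.55, 0.30, 0.02, 0.01, 0.02, 0.02, 0.08],  # ambiguous mixture
]
kept, rejected = split_components(example)
print("kept:", kept, "rejected:", rejected)
```

Note that the third component is rejected even though brain is its most likely class: a rule based on brain probability alone discards ambiguous components, which trades some neural signal for cleaner data.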
Each of these preprocessing steps is essential for improving the quality of EEG data. By standardizing the preprocessing workflow, PIPEMAT-RS minimizes variability introduced by different preprocessing methods, facilitating the comparison of results across studies. The pipeline's implementation in MATLAB, a widely-used platform in the neuroscience community, ensures accessibility and ease of integration into existing research workflows.
The development of PIPEMAT-RS was guided by commonly adopted practices in EEG data analysis, incorporating methods and techniques validated in previous research. Each step was carefully designed and rigorously tested to ensure its effectiveness in enhancing data quality (Figure 1). This comprehensive approach not only improves the reliability of EEG data but also supports the identification of neurophysiological markers critical for understanding various neurological and psychiatric conditions.
PIPEMAT-RS was developed as part of our ongoing research project, approved by the Ethics Committee of the Clinics Hospital, University of São Paulo Medical School (CAAE: 86832518.7.0000.0068). This project aims to enhance the quality and reliability of resting-state EEG data for various neurological and psychiatric studies. The data used in this study are derived from a cohort study, the protocol of which was previously published by Simis et al. (2021)³¹. The main study is still in progress, with data collection ongoing for the remaining clinical groups. The script for the pipeline is provided in Supplementary File 1.
1. File format conversion
2. EEG montage
3. Downsampling and filtering
4. Artifact rejection and rereferencing
5. Independent component analysis (ICA)
6. ICLabel classification
7. Data normalization
The validation of the EEG preprocessing pipeline is essential to ensure the reliability and effectiveness of the methods employed for extracting meaningful neurophysiological data. This process involves empirical evidence from published studies, statistical assessments, and comparisons with established benchmarks in the field.
The robustness and utility of PIPEMAT-RS have been demonstrated through its successful application in several published studies spanning a range of neurological and psychiatric conditions. For example, in the investigation of EEG biomarkers in stroke patients, significant correlations were found between maladaptive motor function, depressive profiles, and specific EEG markers, validating the effectiveness of the preprocessing steps in enhancing signal quality and facilitating meaningful data analysis²⁷. Similarly, the analysis of neural signatures of brain compensation in stroke patients revealed critical insights into the brain's adaptive mechanisms post-stroke, demonstrating PIPEMAT-RS's capability to handle complex datasets and produce reliable results²⁸. Further validation is evident in studies exploring neural adaptations and compensations in fibromyalgia patients, which identified distinct neural signatures associated with pain and compensation mechanisms, supporting the preprocessing methods in diverse clinical contexts²⁹. Additionally, the investigation of compensatory mechanisms in chronic neuropathic pain highlighted the relevance of delta and theta bands in chronic pain states, further demonstrating PIPEMAT-RS's effectiveness in various neurophysiological conditions³⁰.
Statistical validation of PIPEMAT-RS includes both quantitative and qualitative assessments to ensure the robustness of the preprocessing methods. Quantitative measures such as the signal-to-noise ratio (SNR), kurtosis, and skewness were used to objectively assess data quality. The SNR was calculated as the ratio between the power of the signal in the 1-50 Hz frequency range and the power outside this range (e.g., <1 Hz and >50 Hz), which typically contains non-neural noise. Increases in this ratio after preprocessing indicate enhanced preservation of neural information relative to noise. Similarly, kurtosis and skewness were assessed to evaluate the distribution characteristics of the EEG signal. Elevated kurtosis often reflects the presence of sharp transients or muscle artifacts, while high skewness may result from asymmetric signal distributions caused by noise or recording issues. Reductions in these values following preprocessing suggest improved signal regularity and reduced contamination by non-neural sources, supporting the overall enhancement of data quality.
Preprocessing led to notable improvements in SNR, indicating that PIPEMAT-RS effectively reduces noise and enhances the neural signal. For example, in datasets with initially low SNR values (~5 dB), the preprocessing steps increased the SNR to approximately 7.5 dB, reflecting a clearer neural signal. Similarly, reductions in kurtosis (e.g., from 5.2 to 2.3) and skewness (e.g., from 1.5 to 0.8) further confirmed the attenuation of non-neural artifacts².
Qualitative assessments involved visual inspection of the preprocessed data to ensure that artifacts were adequately removed and that the data accurately reflected underlying neural activity (Figure 2). This step allowed researchers to manually verify the effectiveness of the automatic preprocessing steps, ensuring no significant artifacts remained.
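The quantitative measures described above can be sketched as follows. This is an illustrative computation under stated assumptions, not the authors' validation code: power is estimated with a plain discrete Fourier transform over the whole signal, the SNR is expressed as ten times the base-10 logarithm of the in-band to out-of-band power ratio, and kurtosis is the non-excess (Pearson) form, for which a Gaussian signal gives 3. The article does not state which spectral estimator or kurtosis convention was used. The test signal is synthetic.

```python
import cmath
import math


def band_snr_db(signal, fs, low_hz=1.0, high_hz=50.0):
    """SNR in dB: power inside [low_hz, high_hz] over power outside it."""
    n = len(signal)
    in_band = 0.0
    out_band = 0.0
    for k in range(n // 2 + 1):
        freq = k * fs / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)
        )
        # One-sided spectrum: interior bins count twice, DC and Nyquist once.
        weight = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        power = weight * abs(coeff) ** 2
        if low_hz <= freq <= high_hz:
            in_band += power
        else:
            out_band += power
    if out_band == 0.0:
        raise ValueError("no out-of-band power; SNR is undefined")
    return 10.0 * math.log10(in_band / out_band)


def skewness_and_kurtosis(values):
    """Return (skewness, non-excess kurtosis) from central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0.0:
        raise ValueError("constant signal; moments are undefined")
    return m3 / m2 ** 1.5, m4 / m2 ** 2


# Synthetic 1 s signal at 200 Hz: a 10 Hz "neural" rhythm plus weaker 60 Hz noise.
fs = 200
signal = [
    math.sin(2 * math.pi * 10 * i / fs) + 0.1 * math.sin(2 * math.pi * 60 * i / fs)
    for i in range(fs)
]
snr = band_snr_db(signal, fs)
skew, kurt = skewness_and_kurtosis(signal)
print(f"SNR: {snr:.2f} dB, skewness: {skew:.3f}, kurtosis: {kurt:.3f}")
```

In this synthetic case the 60 Hz component has one tenth the amplitude of the 10 Hz component, so the power ratio is 100 and the SNR is 20 dB. Comparing the same measures before and after preprocessing is what the reported changes (for example, approximately 5 dB to 7.5 dB) refer to.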
The inclusion of advanced techniques such as Independent Component Analysis (ICA) and ICLabel for artifact removal has been particularly validated in the literature. Studies have shown that ICA, when combined with automated classification algorithms like ICLabel, achieves classification accuracies that closely match expert-labeled components. ICLabel demonstrates an average classification accuracy of approximately 91%, reflecting strong agreement with human expert classifications and offering a standardized, scalable solution for large EEG datasets while minimizing inter-rater variability32.
Another aspect of validation is the comparison with established benchmarks in the field. The methods and results of our pipeline were aligned with standard practices and guidelines in EEG preprocessing, as recommended by leading researchers and institutions14. While PIPEMAT-RS does not claim superiority over existing pipelines, its modular structure, transparency, and ease of use aim to meet or complement the quality standards commonly accepted in the field (Table 1).
Figure 1: PIPEMAT-RS pipeline code structure. Overview of the MATLAB code implementing each step of the PIPEMAT-RS preprocessing pipeline.
Figure 2: EEG signal quality before and after preprocessing. Comparison of (A) raw data, (B) manually cleaned data, and (C) data processed with PIPEMAT-RS, illustrating artifact reduction and improved signal clarity.
Table 1: Comparison between manual and automated EEG preprocessing using PIPEMAT-RS.
Supplementary File 1: Complete Preprocessing Integrated Pipeline for Resting-State EEG using MATLAB (PIPEMAT-RS) script. This script encompasses all the steps detailed in the Protocol section, from file format conversion to data quality assurance.
The PIPEMAT-RS pipeline was developed to provide a standardized, efficient method for preprocessing resting-state EEG data. Critical steps in this protocol include artifact rejection and Independent Component Analysis (ICA), both of which significantly enhance the signal-to-noise ratio and ensure the extraction of meaningful neural signals. The combination of automatic artifact rejection using the clean_rawdata function and manual inspection via eegplot ensures comprehensive artifact management, balancing efficiency and accuracy. The application of ICLabel for automated classification of independent components further refines the data, reducing manual workload while maintaining high classification accuracy (91%) consistent with human expert labeling32.
Modifications to the protocol may be necessary depending on the characteristics of the dataset. For example, while the pipeline is designed for resting-state EEG data, researchers can adapt filter settings and artifact rejection thresholds for task-based EEG recordings or datasets with different sampling rates. Troubleshooting steps include adjusting the flatline criterion and channel correlation thresholds in clean_rawdata if excessive channel removal occurs or if residual noise persists after automatic cleaning. Additionally, while ICA typically performs well with datasets of 32 or more electrodes, datasets with fewer electrodes may require manual fine-tuning or alternative artifact correction methods to achieve optimal results².
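To show what a flatline criterion controls, the sketch below flags a channel when its signal stays constant for longer than a chosen duration. It is a conceptual illustration only: clean_rawdata's own implementation and defaults may differ, and the 5 s limit, tolerance, and test data here are assumed values, not the pipeline's settings.

```python
def longest_flat_run_seconds(samples, fs, tolerance=1e-12):
    """Duration in seconds of the longest stretch with no change in value."""
    longest = 0
    current = 0
    for previous, value in zip(samples, samples[1:]):
        if abs(value - previous) <= tolerance:
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest / fs


def is_flatlined(samples, fs, max_flat_seconds=5.0):
    """True if the channel is flat for longer than `max_flat_seconds`."""
    return longest_flat_run_seconds(samples, fs) > max_flat_seconds


# Hypothetical channel sampled at 10 Hz with a 6 s stretch stuck at one value.
fs = 10
channel = [0.0, 1.0] + [2.0] * 61 + [3.0]

print(longest_flat_run_seconds(channel, fs))            # 6.0
print(is_flatlined(channel, fs, max_flat_seconds=5.0))  # True
print(is_flatlined(channel, fs, max_flat_seconds=10.0)) # False
```

Raising the allowed duration makes the check more permissive, which is the direction to move in if too many channels are being removed; lowering it makes the check stricter.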
Despite its strengths, PIPEMAT-RS has limitations. The effectiveness of ICA and ICLabel may vary based on the number of channels and the quality of the raw data. High levels of noise or poor electrode contact can reduce the accuracy of component separation and classification. Moreover, the pipeline is optimized for single-site studies and may require additional harmonization steps, such as ComBat, to minimize site-specific variability in multi-center research33. While PIPEMAT-RS enhances data quality through standardized preprocessing, the potential for user-induced variability remains if default settings are altered without careful validation.
Although ICA combined with ICLabel classification offers an efficient and automated solution for artifact rejection, it is important to acknowledge its limitations. The separation of neural and non-neural components is not perfect, and rejecting components classified as artifacts may still result in the unintended removal of neural signals, especially when sources are mixed²². While PIPEMAT-RS applies a conservative threshold (retaining components with a brain probability > 0.7) to reduce this risk, it does not eliminate it. This threshold follows standard practices to balance signal preservation and artifact removal, but future enhancements to the pipeline may incorporate more refined strategies (such as dipole fitting, wavelet-enhanced ICA, or targeted ICA cleaning) to further improve the specificity and accuracy of artifact rejection.
Compared to existing methods, PIPEMAT-RS offers a streamlined, standardized sequence of preprocessing steps that reduces variability between studies. Unlike flexible platforms like BEAPP, which allow users to customize preprocessing sequences, PIPEMAT-RS enforces a fixed structure, ensuring consistency across datasets34. This approach minimizes manual intervention, reduces human error, and ensures reproducibility, particularly in large-scale studies. Additionally, the integration of advanced tools like ICLabel and the focus on resting-state EEG data distinguish PIPEMAT-RS from other pipelines that may not prioritize artifact rejection and independent component analysis to the same extent.
The significance of PIPEMAT-RS lies in its ability to produce high-quality, reliable EEG data suitable for diverse applications in neuroscience and clinical research. The pipeline has been successfully applied to studies investigating neural markers of stroke recovery, chronic pain, and fibromyalgia, demonstrating its versatility and robustness27,28,29,30. By improving data quality and reducing preprocessing time, PIPEMAT-RS facilitates large-scale studies, contributes to the identification of neurophysiological biomarkers, and supports advancements in personalized medicine. Its standardized approach ensures that results are comparable across studies, enhancing the reproducibility and reliability of EEG research in both clinical and academic settings.
In conclusion, PIPEMAT-RS provides a robust, standardized solution for preprocessing resting-state EEG data, addressing common challenges related to artifact rejection, signal clarity, and data consistency. By integrating automated techniques such as ICA and ICLabel with manual verification steps, the pipeline ensures high-quality, reproducible data suitable for a wide range of neurological and psychiatric research applications. Its fixed sequence of preprocessing steps minimizes user-induced variability, facilitating consistent results across studies. While the pipeline demonstrates strong performance in various clinical contexts, including stroke and chronic pain research, future work should focus on validating its applicability across diverse experimental paradigms and multi-site studies. Overall, PIPEMAT-RS offers a structured and well-documented alternative for EEG preprocessing, designed to enhance data quality, reproducibility, and accessibility across studies.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
LMM was supported by a postdoctoral research grant #21/05897-5, São Paulo Research Foundation (FAPESP). AC was supported by a scientific initiation grant #21/12790-2, São Paulo Research Foundation (FAPESP). SPB was supported by a postdoctoral research grant #20/08512-4, São Paulo Research Foundation (FAPESP). FF and LRB are supported by research grant #17/12943-8, São Paulo Research Foundation (FAPESP). Individually, FF received support from NIH 2020 R01 AT, Project #1R01AT009491-01A1.
Name | Company | Catalog Number | Comments |
MATLAB | MathWorks | R2020a or newer | Required for executing the PIPEMAT-RS script. |
EEG Recording System | Brain Products | https://www.brainproducts.com/ | EEG acquisition system used for data collection. Other brands include BioSemi / ANT Neuro (or equivalent) |
EEGLAB Toolbox | Swartz Center for Computational Neuroscience | N/A | Open-source MATLAB toolbox for EEG analysis. |
Electrodes (Ag/AgCl) | Brain Products | https://www.brainproducts.com/solutions/r-net/ | Used in EEG data acquisition. Other brands include BioSemi / ANT Neuro (or equivalent) |
ICLabel Plugin | Swartz Center for Computational Neuroscience | N/A | Automated artifact classification tool for EEG. |
PIPEMAT-RS Script | N/A | N/A | Custom MATLAB script for standardized EEG preprocessing. |
Signal Amplifier | Brain Products | https://www.brainproducts.com/solutions/#amplifiers | Amplifies EEG signals for processing. Other brands include BioSemi / ANT Neuro (or equivalent) |
Standard 64-Channel EEG Cap | Brain Products | https://www.brainproducts.com/solutions/#electrodes-caps | Electrode cap for EEG recording. Other brands include BioSemi / ANT Neuro (or equivalent) |
Copyright © 2025 MyJoVE Corporation. All rights reserved