<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
  <front>
    <journal-meta>
      <journal-id journal-id-type="nlm-ta">REA Press</journal-id>
      <journal-title>REA Press</journal-title><issn pub-type="ppub">3042-0180</issn><issn pub-type="epub">3042-0180</issn><publisher>
      	<publisher-name>REA Press</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <article-id pub-id-type="doi">10.22105/scfa.v1i2.35</article-id>
      <article-categories>
        <subj-group subj-group-type="heading">
          <subject>Research Article</subject>
        </subj-group>
        <subj-group><subject>Deep learning, Support vector machine, Random forest, Artificial neural network, Convolutional neural network, Mel-frequency cepstral coefficients, Librosa</subject></subj-group>
      </article-categories>
      <title-group>
        <article-title>Deep Audio Classifier: An Artificial Neural Network Approach</article-title></title-group>
      <contrib-group><contrib contrib-type="author">
	<name name-style="western">
	<surname>Yadav</surname>
		<given-names>Abhishek</given-names>
	</name>
	<aff>Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India.</aff>
	</contrib><contrib contrib-type="author">
	<name name-style="western">
	<surname>Raj</surname>
		<given-names>Abhishek</given-names>
	</name>
	<aff>Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India.</aff>
	</contrib><contrib contrib-type="author">
	<name name-style="western">
	<surname>Anand</surname>
		<given-names>Sankalp</given-names>
	</name>
	<aff>Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India.</aff>
	</contrib><contrib contrib-type="author">
	<name name-style="western">
	<surname>Kumar</surname>
		<given-names>Vineet</given-names>
	</name>
	<aff>Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India.</aff>
	</contrib><contrib contrib-type="author">
	<name name-style="western">
	<surname>Kumar</surname>
		<given-names>Abhay</given-names>
	</name>
	<aff>Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar, Odisha, India.</aff>
	</contrib></contrib-group>		
      <pub-date pub-type="ppub">
        <month>06</month>
        <year>2024</year>
      </pub-date>
      <pub-date pub-type="epub">
        <day>07</day>
        <month>06</month>
        <year>2024</year>
      </pub-date>
      <volume>1</volume>
      <issue>2</issue>
      <permissions>
        <copyright-statement>© 2024 REA Press</copyright-statement>
        <copyright-year>2024</copyright-year>
        <license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/2.5/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.</p></license>
      </permissions>
      <related-article related-article-type="companion" vol="2" page="e235" id="RA1" ext-link-type="pmc">
			<article-title>Deep Audio Classifier: An Artificial Neural Network Approach</article-title>
      </related-article>
	  <abstract abstract-type="toc">
		<p>
			This study develops a deep audio classifier by comparing four machine learning and deep learning models: a Support Vector Machine (SVM), a Random Forest (RF), an Artificial Neural Network (ANN), and a Convolutional Neural Network (CNN). All models were trained and evaluated on the UrbanSound8K dataset, with the aim of building robust classifiers for complex urban sound environments. The audio samples were preprocessed with noise reduction, normalization, and trimming to a consistent duration, and Mel-Frequency Cepstral Coefficients (MFCCs) were extracted as features. The ANN, built from dense layers with a softmax output for multi-class classification, reached an accuracy of 80.20%. The SVM and RF models, representing linear and ensemble approaches respectively, achieved accuracies of 82.34% and 84.90%. The CNN performed best with an accuracy of 88.45%, reflecting its ability to capture spatial hierarchies and localized patterns in audio data. Performance varied by class, with high precision on certain sounds such as car horns and gunshots. The paper concludes with recommendations for future work: more sophisticated data augmentation, hybrid models, and more extensive hyperparameter tuning to improve classification accuracy and adaptability in real-world urban settings.
		</p>
		</abstract>
    </article-meta>
  </front>
  <body></body>
  <back>
  </back>
</article>