How to Get a Raw Audio Stream From Agora.io iOS SDK
Agora.io is the most widely used real-time network in the world, and supports real-time video calls, voice chat, and interactive streaming, helping developers and businesses deliver richer in-app experiences. Agora offers a host of SDKs, including Android, Flutter, React Native, and iOS, to deliver high-quality, real-time interaction.
In this article, you'll build an iOS app that captures a raw audio stream using the Agora iOS SDK. Then, with the help of Agora's raw data functionality, you can process that audio data to add the playback effects you want.
Why Choose Agora?
Offering real-time interaction in your application is a great way to increase user engagement, but building it on your own is a complicated and time-consuming process, and it takes you away from your core product. Thanks to Agora’s developer-first approach and their extensive offerings of APIs and SDKs, you can now integrate this functionality almost effortlessly.
Agora frees you from spending time worrying about the logistics of real-time interaction. Their intelligent global network provides services to users in over two hundred countries and can scale from one user to millions of concurrent users as your app and business grow. Their wide reach also ensures that no matter where your userbase is, interactions won’t lag or drop. With the largest global real-time network, 99.9 percent uptime, and around-the-clock customer support, it’s easy to see why it’s such a popular platform.
Use Cases for Agora SDKs
The robustness of Agora’s offerings means that the use cases are nearly infinite. Some popular use cases include:
- Livestreaming: Agora’s platform enables you to easily livestream even complex, multi-host events. If you’re looking for drama or theatrics, you’ll want to investigate their voice effects; if you’re trying to create an in-person feeling, their spatial audio functionality can make your virtual chats feel closer to in-person ones than ever before.
- Voice calls: Whether you want to allow players in a game to talk to each other or to offer live, on-demand voice support to enterprise customers at your business, Agora’s voice calling functionality has you covered.
- Education: Full-featured education offerings enable users to create custom classrooms. Thanks to interactive whiteboards, screen sharing, breakout rooms, and other features, you can build a classroom experience that’s more engaging and accessible than traditional online course delivery.
- Healthcare: As an increasing number of patients prefer telehealth services for many everyday ailments, remote healthcare services have become a larger part of many providers’ models. Agora is HIPAA-compliant and enables you to meet with patients and colleagues, as well as to offer or participate in ongoing education.
- Retail: From live shopping events to window shopping, shopping is more fun when it’s social. Agora lets you build a connection with your customers, whether that’s by showcasing a specific product or media personality, or by sharing behind-the-scenes looks at how their favorite products are created.
How to Get the Raw Audio Stream
To follow along with this tutorial, you’ll need the following:
- Working knowledge of iOS/Swift
- A free Agora.io account
- Xcode 12+
- Agora’s iOS SDK
At the end of this tutorial, you’ll have an iOS app that broadcasts audio to other consumers. If you’d like to see all the code in one place, you can easily fetch it from GitHub.
Building the Audio App
The first part of this tutorial walks through building the app in two steps. First, you'll create a project in the Agora console. Second, you'll create the iOS app that actually uses that project.
Creating the Project on Agora
To start, log in to the Agora console and navigate to the Project Management page in the left-hand navigation, then click Create.
In the new project pop-up, you’ll need to name your project and specify a use case. If it’s not already selected, select the Testing Mode authentication mechanism, then click Submit.
Next, retrieve your app ID by copying it from the App Configuration page.
When you're in a test environment, you need to name your channel and generate your token manually. In the "App certificate" section of the configuration page, click Generate temp RTC token. In the pop-up that appears, enter a channel name, then click the Generate Temp Token button.
Be sure to write down all of these settings, since you'll need the app ID, channel name, and temporary token to create your app. Note that while the app ID is also used in production, the manually generated channel name and temporary token only work while testing; once you move out of testing, you'll generate tokens programmatically instead.
Integrating the Agora Project
At this point, you've created the project on Agora. Now you'll integrate the Agora iOS SDK with your app.
There are two main ways to install dependencies for an iOS app: CocoaPods and Swift Package Manager. For this tutorial, you'll use Swift Package Manager.
Start by opening the Package Dependencies tab of your Xcode project settings, then click the + to add a dependency.
Next, search for https://github.com/AgoraIO/AgoraRtcEngine_iOS to find AgoraRtcEngine_iOS. Once the package has been located, click Add Package. If you manage dependencies in a Package.swift manifest instead of through the Xcode UI, the equivalent declaration is sketched below.
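Here's a minimal Package.swift sketch of that dependency. The version requirement, package name, and the "RtcBasic" product name are assumptions; check them against the package's own manifest before using this.

```swift
// swift-tools-version:5.5
// A sketch, not a definitive manifest: the version and product name below
// are assumptions; verify them against the AgoraRtcEngine_iOS package.
import PackageDescription

let package = Package(
    name: "AgoraAudioDemo",
    platforms: [.iOS(.v13)],
    dependencies: [
        .package(url: "https://github.com/AgoraIO/AgoraRtcEngine_iOS", from: "3.7.0")
    ],
    targets: [
        .target(
            name: "AgoraAudioDemo",
            dependencies: [
                .product(name: "RtcBasic", package: "AgoraRtcEngine_iOS")
            ]
        )
    ]
)
```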
For the AgoraRtcEngine to function correctly, it needs access to the phone's camera and microphone. Whenever you request access to sensitive parts of the user's phone, you must provide an explanation with the request so the user understands what it will be used for and why they should grant access.
To do this, add a short description to your `Info.plist` under two keys: Privacy – Camera Usage Description (the raw key `NSCameraUsageDescription`) and Privacy – Microphone Usage Description (the raw key `NSMicrophoneUsageDescription`). For example, for the microphone, you might add something like "To enable voice calls, Agora needs access to your microphone."
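For reference, the raw `Info.plist` entries look roughly like this (the description strings here are just examples; write your own):

```xml
<key>NSCameraUsageDescription</key>
<string>To enable video calls, Agora needs access to your camera.</string>
<key>NSMicrophoneUsageDescription</key>
<string>To enable voice calls, Agora needs access to your microphone.</string>
```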
The last step needed to set up the library is to write the code that requests access to the hardware on the user's phone. You'll do this inside a Swift class called `PermissionManager`:
```swift
import UIKit
import AVFoundation

class PermissionManager {
    func requestCameraAccess() {
        AVCaptureDevice.requestAccess(for: .video) { granted in
            // Handle the user's response, e.g., update the UI if access was denied
        }
    }

    func requestMicrophoneAccess() {
        AVAudioSession.sharedInstance().requestRecordPermission { granted in
            // Handle the user's response, e.g., update the UI if access was denied
        }
    }
}
```
If you're not familiar with the above code, you can read more in the Apple documentation for microphone access and camera access.
Developing the Application
With Agora set up and the library installed, you're ready to work on the application. Start by creating a file named `Config.swift` and setting up your Agora credentials:
```swift
let kAppId: String = "8923kj32hkj3k23kj23788"
let kTempToken: String = "0068932jkkj3093298jk32jkq3UBi"
let kTempChannelName: String = "my_agora_test"
```
Please note that the credentials in this example are random strings. In your own project, add `Config.swift` to your `.gitignore` so your credentials aren't exposed.
Next, you'll add some basic views to the screen to provide a graphical interface for the app. You can view the finished file in the GitHub repo.
For this step of the tutorial, paste the following into `ViewController.swift`:
```swift
import UIKit

class ViewController: UIViewController {
    lazy var startButton: UIButton = {
        let button: UIButton = UIButton()
        button.backgroundColor = .brown
        button.setTitle("Start Audio Streaming", for: .normal)
        button.translatesAutoresizingMaskIntoConstraints = false
        button.addTarget(self, action: #selector(startEvent(_:)), for: .touchUpInside)
        return button
    }()

    let permissionManager: PermissionManager = PermissionManager()

    override func loadView() {
        super.loadView()
        view.addSubview(startButton)
        startButton.leadingAnchor.constraint(equalTo: view.leadingAnchor, constant: 48).isActive = true
        startButton.trailingAnchor.constraint(equalTo: view.trailingAnchor, constant: -48).isActive = true
        startButton.heightAnchor.constraint(equalToConstant: 49).isActive = true
        startButton.centerYAnchor.constraint(equalTo: view.centerYAnchor).isActive = true
        startButton.centerXAnchor.constraint(equalTo: view.centerXAnchor).isActive = true
        startButton.layer.cornerRadius = 8
    }

    @objc
    func startEvent(_ sender: UIButton) {
        // Filled in later in the tutorial
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        permissionManager.requestCameraAccess()
        permissionManager.requestMicrophoneAccess()
    }
}
```
At this point, the app's UI consists of a single "Start Audio Streaming" button centered on the screen.
Next, instantiate the Agora engine, configured with your app ID:
```swift
// MARK: INSTANTIATE AGORA ENGINE
let config: AgoraRtcEngineConfig = AgoraRtcEngineConfig()
config.appId = kAppId
agoraKit = AgoraRtcEngineKit.sharedEngine(with: config, delegate: self)
```
For this to work, your `ViewController` needs to conform to `AgoraRtcEngineDelegate` and hold a reference to the engine:
```swift
import UIKit
import AgoraRtcKit

class ViewController: UIViewController, AgoraRtcEngineDelegate {
    var agoraKit: AgoraRtcEngineKit?

    override func viewDidLoad() {
        super.viewDidLoad()
        // MARK: REQUEST NECESSARY PERMISSIONS
        permissionManager.requestCameraAccess()
        permissionManager.requestMicrophoneAccess()
        // MARK: INSTANTIATE AGORA ENGINE
        let config: AgoraRtcEngineConfig = AgoraRtcEngineConfig()
        config.appId = kAppId
        agoraKit = AgoraRtcEngineKit.sharedEngine(with: config, delegate: self)
    }
}
```
If you don't set a client role, the default is one-to-one communication, like a private call. Setting the client role to broadcaster allows one-to-many communication; the broadcaster role applies in Agora's live-broadcasting channel profile, so you'll set that too. You can do this with the following:
```swift
// MARK: SET PROFILE AND ROLE
agoraKit?.setChannelProfile(.liveBroadcasting)
agoraKit?.setClientRole(.broadcaster)
```
Your `viewDidLoad` should now look like this:
```swift
override func viewDidLoad() {
    super.viewDidLoad()
    // MARK: REQUEST NECESSARY PERMISSIONS
    permissionManager.requestCameraAccess()
    permissionManager.requestMicrophoneAccess()
    // MARK: INSTANTIATE AGORA ENGINE
    let config: AgoraRtcEngineConfig = AgoraRtcEngineConfig()
    config.appId = kAppId
    agoraKit = AgoraRtcEngineKit.sharedEngine(with: config, delegate: self)
    // MARK: SET PROFILE AND ROLE
    agoraKit?.setChannelProfile(.liveBroadcasting)
    agoraKit?.setClientRole(.broadcaster)
}
```
Now you'll create a channel object, which is what lets the app join a channel (and would let you support multiple channels later). Join the channel when the start button is tapped, using the following:
```swift
var channel: AgoraRtcChannel?

@objc
func startEvent(_ sender: UIButton) {
    // MARK: INSTANTIATE AGORA CHANNEL
    let mediaOptions: AgoraRtcChannelMediaOptions = AgoraRtcChannelMediaOptions()
    // Allow the user to publish audio to the channel
    mediaOptions.publishLocalAudio = true
    channel = agoraKit?.createRtcChannel(kTempChannelName)
    channel?.join(byToken: kTempToken, info: nil, uid: 0, options: mediaOptions)
}
```
Controlling the Local Device's Microphone with the Agora API
To complete your app, create a toggle for the microphone control, which will allow users to turn the microphone on and off. By passing `true` or `false` to `agoraKit?.enableLocalAudio(_:)`, you can open and close the microphone.
```swift
var isSpeakerOn: Bool = true

lazy var speakerButton: UIButton = {
    let button: UIButton = UIButton()
    button.backgroundColor = .blue
    button.setTitle("Off", for: .normal)
    button.translatesAutoresizingMaskIntoConstraints = false
    button.addTarget(self, action: #selector(speakerEvent(_:)), for: .touchUpInside)
    return button
}()

@objc
func speakerEvent(_ sender: UIButton) {
    isSpeakerOn.toggle()
    // The button title shows the action the next tap will perform
    let title: String = isSpeakerOn ? "Off" : "On"
    speakerButton.setTitle(title, for: .normal)
    // enableLocalAudio toggles local audio capture (the microphone)
    agoraKit?.enableLocalAudio(isSpeakerOn)
}
```
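Note that `speakerButton` still needs to be added to the view hierarchy before it will appear. One way to do this is to extend the existing `loadView` override; the layout constants below are illustrative, not prescribed by the tutorial:

```swift
override func loadView() {
    super.loadView()
    view.addSubview(startButton)
    // ... the startButton constraints from earlier remain here ...

    view.addSubview(speakerButton)
    // Place the toggle below the start button; the constants are illustrative
    speakerButton.topAnchor.constraint(equalTo: startButton.bottomAnchor, constant: 16).isActive = true
    speakerButton.leadingAnchor.constraint(equalTo: view.leadingAnchor, constant: 48).isActive = true
    speakerButton.trailingAnchor.constraint(equalTo: view.trailingAnchor, constant: -48).isActive = true
    speakerButton.heightAnchor.constraint(equalToConstant: 49).isActive = true
    speakerButton.layer.cornerRadius = 8
}
```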
Processing Audio Data
At this point, you have a functional audio app, but you may want to do more with the audio data, such as capturing the raw stream or specifying an audio format. Adding these capabilities takes only a few more small steps and code samples.
Capturing Audio
To retrieve the raw audio, your `ViewController` will need to conform to `AgoraAudioDataFrameProtocol`. You'll also need to call `setAudioDataFrame()` and pass your class in:
```swift
override func viewDidLoad() {
    super.viewDidLoad()
    // MARK: INSTANTIATE AGORA ENGINE
    let config: AgoraRtcEngineConfig = AgoraRtcEngineConfig()
    config.appId = kAppId
    agoraKit = AgoraRtcEngineKit.sharedEngine(with: config, delegate: self)
    // MARK: REQUEST NECESSARY PERMISSIONS
    permissionManager.requestCameraAccess()
    permissionManager.requestMicrophoneAccess()
    // MARK: SET PROFILE AND ROLE
    agoraKit?.setChannelProfile(.liveBroadcasting)
    agoraKit?.setClientRole(.broadcaster)
    // MARK: TO RETRIEVE AUDIO DATA
    agoraKit?.setAudioDataFrame(self)
}
```
The key part of the above code sample is this line:
```swift
// MARK: TO RETRIEVE AUDIO DATA
agoraKit?.setAudioDataFrame(self)
```
This registers your class as the raw audio frame delegate, which is what allows you to capture audio. With that in place, you can implement the function `getObservedAudioFramePosition`, which controls the positions at which audio is observed:
```swift
extension ViewController: AgoraAudioDataFrameProtocol {
    // Available positions: playback, mixed, record, beforeMixing
    func getObservedAudioFramePosition() -> AgoraAudioFramePosition {
        return .playback
    }
}
```
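In the 3.x SDK, `AgoraAudioFramePosition` behaves like an option set, so if you want to observe more than one position, you can combine them. A minimal sketch, assuming the member names match your SDK version:

```swift
// A variant of the function above that observes both recorded and
// playback audio; combine whichever positions you need.
func getObservedAudioFramePosition() -> AgoraAudioFramePosition {
    return [.record, .playback]
}
```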
There are more functions you can implement depending on what you need, including `onRecordAudioFrame`, `onPlaybackAudioFrame`, and `onMixedAudioFrame`, which allow you to capture audio as it's recorded, as it's played back, or after audio from all participants has been mixed.
An example of these functions in action would look like this:
```swift
// Return true to indicate that the frame is valid and should be used
func onRecordAudioFrame(_ frame: AgoraAudioFrame) -> Bool {
    return true
}

func onPlaybackAudioFrame(_ frame: AgoraAudioFrame) -> Bool {
    return true
}

func onMixedAudioFrame(_ frame: AgoraAudioFrame) -> Bool {
    return true
}
```
Setting the Audio Format
`AgoraAudioDataFrameProtocol` also allows you to set audio parameters, such as the number of channels and the sample rate. This is enabled by the helper functions `getRecordAudioParams`, `getMixedAudioParams`, and `getPlaybackAudioParams`, which let you specify the audio format for each of these events.
```swift
// Implement the getRecordAudioParams callback, and set the audio recording format in its return value.
func getRecordAudioParams() -> AgoraAudioParam {
    let param = AgoraAudioParam()
    param.channel = 1
    param.mode = .readOnly
    param.sampleRate = 44100
    param.samplesPerCall = 1024
    return param
}

// Implement the getMixedAudioParams callback, and set the mixed audio format in its return value.
func getMixedAudioParams() -> AgoraAudioParam {
    let param = AgoraAudioParam()
    param.channel = 1
    param.mode = .readOnly
    param.sampleRate = 44100
    param.samplesPerCall = 1024
    return param
}

// Implement the getPlaybackAudioParams callback, and set the playback audio format in its return value.
func getPlaybackAudioParams() -> AgoraAudioParam {
    let param = AgoraAudioParam()
    param.channel = 1
    param.mode = .readOnly
    param.sampleRate = 44100
    param.samplesPerCall = 1024
    return param
}
```
Tidying Up
It's important to clean up when the user leaves the stream, to prevent problems like memory leaks. To do that, add the following code to `ViewController.swift`:
```swift
override func viewDidDisappear(_ animated: Bool) {
    super.viewDidDisappear(animated)
    channel?.leaveChannel()
    agoraKit?.leaveChannel(nil)
    // Unregister the audio frame delegate before destroying the engine
    agoraKit?.setAudioDataFrame(nil)
    AgoraRtcEngineKit.destroy()
}
```
Integration Test
To test what you've completed, navigate to Agora's web demo call in a browser and enter your app ID, token, and channel name.
When the app is running, you should be able to hear your own voice through the demo.
Going Further with Audio Processing
You've created an audio app with this tutorial, but you can take the audio data further: processing or analyzing it lets you add functionality beyond what's covered here.
There are two main ways to process raw audio data:
- Pre-processing: Modifies the captured audio/video data before it's encoded and sent (see the sketch after this list)
- Post-processing: Modifies the received audio/video data after it's decoded, before playback
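To make pre-processing concrete, here's a minimal sketch of an in-place gain adjustment applied to raw frames before they're encoded. It assumes the audio is 16-bit PCM and that `AgoraAudioFrame` exposes `buffer`, `channels`, and `samplesPerChannel` as in the 3.x SDK; the `applyGain` helper is hypothetical, not part of the SDK:

```swift
extension ViewController {
    // Hypothetical pre-processing helper: scales 16-bit PCM samples in place.
    // Property names (buffer, channels, samplesPerChannel) follow the 3.x
    // AgoraAudioFrame interface; verify them against your SDK version.
    func applyGain(_ gain: Float, to frame: AgoraAudioFrame) {
        guard let rawBuffer = frame.buffer else { return }
        let sampleCount = frame.samplesPerChannel * frame.channels
        let samples = rawBuffer.bindMemory(to: Int16.self, capacity: sampleCount)
        for i in 0..<sampleCount {
            let scaled = Float(samples[i]) * gain
            // Clamp to the Int16 range to avoid overflow distortion
            samples[i] = Int16(max(min(scaled, Float(Int16.max)), Float(Int16.min)))
        }
    }
}
```

You could call something like `applyGain(0.5, to: frame)` from `onRecordAudioFrame` before returning `true`. For modifications to take effect, the frames generally need a writable mode, such as `.readWrite` instead of `.readOnly` in `getRecordAudioParams`.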
Audio processing is complex, and if you’re interested in it, it’s worth looking at the Agora documentation, where there’s a tutorial about how to process the audio data you retrieve. You could also review the example app on GitHub.
Next Steps for Using Raw Audio Data
In this post, you’ve learned how to build an iOS app using the Agora iOS SDK that can be used to broadcast to a channel. If you wanted to take the app further, you could add features like allowing multiple hosts or user responses. While a simple broadcasting app might be straightforward, as you add features and users, you can rapidly end up with far more data than you can manage on your own. To make sense of it all, you’ll want something like Symbl.ai conversation intelligence.
Symbl offers APIs that are easy to integrate and deploy and can be used with speech-, text-, or video-driven applications to provide services like analytics and engagement metrics, real-time transcription in over twenty languages, and content moderation. It can also generate post-conversation summaries that include things such as topics discussed, questions asked and answered, plans made, and even items that need to be followed up on.
Symbl.ai leverages artificial intelligence and machine learning technologies to add value to both real-time and asynchronous exchanges. Their developer-friendly APIs and multi-language SDKs make it easy to add functionality to your applications, and don’t require you to develop or train your own models—they work right out of the box. Take advantage of both Agora and Symbl.ai to offer users feature-rich and real-time experiences.
READ MORE: How to Get a Raw Audio Stream from Agora.io using its Web SDK