Medical Transcription API

The Nuxera Medical Transcription API converts audio recordings of medical consultations into text transcripts and extracts structured clinical information from them. It is designed to handle medical terminology and the natural flow of doctor-patient conversations.

Endpoint

URL: /api/transcribe

Method: POST

Content-Type: multipart/form-data

Authentication

Include your bearer token in the Authorization header:

Authorization: Bearer <YOUR_TOKEN>

Request Parameters

Parameter	Type	Required	Description
audio	File	Yes	Audio file in WAV format
doctorName	string	Yes	Name of the doctor conducting the consultation
patientName	string	No	Name of the patient (improves accuracy)
removeSilence	string	No	Set to "true" to remove silence from the audio before transcription (improves processing efficiency)
retry	string	No	Set to "true" to reprocess existing audio without uploading again
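
Because the optional flags are passed as the string "true" rather than as booleans, it can help to build the form fields in one place. The following is a minimal sketch: `build_form_fields` is a hypothetical helper name, not part of the API, and it assumes that leaving a flag out has the same effect as not enabling it.

```python
def build_form_fields(doctor_name, patient_name=None,
                      remove_silence=False, retry=False):
    """Build the non-file form fields for POST /api/transcribe.

    Hypothetical helper: only the field names and the "true" string
    convention come from the parameter table above.
    """
    if not doctor_name or not doctor_name.strip():
        raise ValueError("doctorName is required")

    fields = {"doctorName": doctor_name.strip()}
    if patient_name:
        fields["patientName"] = patient_name
    # Flags are sent as the string "true" and omitted when not enabled
    # (assumption: an absent flag behaves like a disabled one).
    if remove_silence:
        fields["removeSilence"] = "true"
    if retry:
        fields["retry"] = "true"
    return fields


print(build_form_fields("Dr. Smith", patient_name="John Doe", remove_silence=True))
```

The returned dictionary can be passed as the form data of a multipart request alongside the `audio` file part.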

Response

Success Response (200 OK)

{
  "transcription": "Doctor: Hello, how are you feeling today?\nPatient: I've been having severe headaches for the past week, especially in the morning.\nDoctor: I see. Have you noticed any other symptoms along with the headaches?\nPatient: Yes, I sometimes feel nauseous, and bright lights bother me.",
  "classifiedInfo": {
    "ICD-10 Code": [
      "G43.909 - Migraine, unspecified, not intractable, without status migrainosus"
    ],
    "Chief Complaint": ["Severe headaches for one week, worse in the morning"],
    "History Of Present Illness": "Patient reports severe headaches for the past week, particularly in the morning. Associated symptoms include nausea and photophobia. No prior history of migraines mentioned.",
    "Current Medications": ["None mentioned"],
    "Imaging Results": [],
    "Assessment": [
      "Patient presents with symptoms consistent with migraine headaches including photophobia and nausea"
    ],
    "Diagnosis": ["Migraine headache"],
    "Prescriptions": [
      {
        "Medication": "Sumatriptan",
        "Dosage": "50mg as needed for migraine attacks"
      }
    ],
    "Plan": [
      "Start Sumatriptan for acute attacks",
      "Avoid triggers including bright lights",
      "Keep headache diary"
    ],
    "Follow-up": ["2 weeks"]
  }
}

Error Response (400 Bad Request)

{
  "error": "Invalid request",
  "details": "Doctor name is required"
}

Error Response (500 Internal Server Error)

{
  "error": "Failed to transcribe audio"
}

Response Structure

The API response contains two main components:

1. Transcription

A text representation of the conversation with speakers labeled ("Doctor:" and "Patient:"). The transcription preserves the natural flow of the conversation and includes all relevant clinical details.
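
For downstream processing it is often convenient to split the transcription into individual speaker turns. The sketch below assumes the layout shown in the sample response: one turn per line, each starting with "Doctor:" or "Patient:". The function name `split_turns` and the handling of unlabeled lines are illustrative choices, not behavior defined by the API.

```python
def split_turns(transcription):
    """Split a labeled transcript into (speaker, text) tuples."""
    turns = []
    for line in transcription.splitlines():
        line = line.strip()
        if not line:
            continue
        speaker, sep, text = line.partition(":")
        if sep and speaker in ("Doctor", "Patient"):
            turns.append((speaker, text.strip()))
        elif turns:
            # Unlabeled line: treat it as a continuation of the previous turn
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, f"{prev_text} {line}")
        else:
            # Text before the first label (assumption: keep it, mark as unknown)
            turns.append(("Unknown", line))
    return turns


sample = "Doctor: Hello, how are you feeling today?\nPatient: I've been having severe headaches."
for speaker, text in split_turns(sample):
    print(f"{speaker}: {text}")
```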

2. Classified Information

The API automatically extracts and categorizes clinical information into standard sections:

Field	Description	Example
ICD-10 Code	Standard diagnostic codes	"J45.909 - Unspecified asthma, uncomplicated"
Chief Complaint	Primary reason for visit	"Shortness of breath for 3 days"
History Of Present Illness	Detailed symptom history	"Patient reports developing shortness of breath 3 days ago..."
Current Medications	Active medications	["Albuterol inhaler", "Lisinopril 10mg daily"]
Imaging Results	Findings from any imaging	["Chest X-ray shows no acute abnormalities"]
Assessment	Clinical evaluation	["Symptoms consistent with asthma exacerbation"]
Diagnosis	Medical diagnoses	["Acute asthma exacerbation"]
Prescriptions	New or continued medications	[{"Medication": "Prednisone", "Dosage": "40mg daily for 5 days"}]
Plan	Treatment and next steps	["Increase albuterol use", "Complete prednisone course"]
Follow-up	Recommended follow-up timing	["1 week"]
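
The ICD-10 entries in the examples combine a code and a description in a single string. If the two parts are needed separately, they can be split as sketched below. This assumes every entry follows the "CODE - description" pattern seen in the examples above; `split_icd10` and `icd10_codes` are hypothetical helper names.

```python
def split_icd10(entry):
    """Split 'CODE - description' into a (code, description) tuple."""
    code, sep, description = entry.partition(" - ")
    if not sep:
        # No separator found: return the whole entry as the code
        return entry.strip(), ""
    return code.strip(), description.strip()


def icd10_codes(classified_info):
    """Return (code, description) tuples for every ICD-10 entry."""
    return [split_icd10(entry) for entry in classified_info.get("ICD-10 Code", [])]


info = {"ICD-10 Code": ["J45.909 - Unspecified asthma, uncomplicated"]}
print(icd10_codes(info))
```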

Example Usage

cURL

curl -X POST 'https://api.nuxera.ai/api/transcribe' \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -F 'audio=@consultation.wav' \
  -F 'doctorName=Dr. Smith' \
  -F 'patientName=John Doe' \
  -F 'removeSilence=true'

JavaScript

// Function to transcribe a medical consultation
async function transcribeConsultation(
  token,
  audioFile,
  doctorName,
  patientName
) {
  const formData = new FormData();
  formData.append('audio', audioFile);
  formData.append('doctorName', doctorName);

  if (patientName) {
    formData.append('patientName', patientName);
  }

  formData.append('removeSilence', 'true'); // Optional

  try {
    const response = await fetch('https://api.nuxera.ai/api/transcribe', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
      },
      body: formData,
    });

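    if (!response.ok) {
      // Error bodies are JSON in the examples above; fall back to an empty
      // object if the body cannot be parsed (for example, a proxy error page)
      const errorData = await response.json().catch(() => ({}));
      throw new Error(errorData.error || 'Transcription failed');
    }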

    return await response.json();
  } catch (error) {
    console.error('Transcription error:', error);
    throw error;
  }
}

// Example usage
document
  .getElementById('upload-form')
  .addEventListener('submit', async (event) => {
    event.preventDefault();

    const audioFile = document.getElementById('audio-file').files[0];
    const doctorName = document.getElementById('doctor-name').value;
    const patientName = document.getElementById('patient-name').value;

    try {
      const result = await transcribeConsultation(
        localStorage.getItem('authToken'),
        audioFile,
        doctorName,
        patientName
      );

      // Display transcription
      document.getElementById('transcription').textContent =
        result.transcription;

      // Process classified information
      // (displayClassifiedInfo is an application-defined rendering function, not shown here)
      displayClassifiedInfo(result.classifiedInfo);
    } catch (error) {
      alert(`Error: ${error.message}`);
    }
  });

Python

import requests

def transcribe_consultation(token, audio_file_path, doctor_name, patient_name=None, remove_silence=True):
    """
    Transcribe a medical consultation using the Nuxera API

    Args:
        token (str): Authentication token
        audio_file_path (str): Path to the audio file
        doctor_name (str): Name of the doctor
        patient_name (str, optional): Name of the patient
        remove_silence (bool, optional): Whether to remove silence from the audio

    Returns:
        dict: API response containing transcription and classified information
    """
    url = 'https://api.nuxera.ai/api/transcribe'

    # Prepare form fields
    data = {
        'doctorName': doctor_name,
        'removeSilence': 'true' if remove_silence else 'false'
    }

    if patient_name:
        data['patientName'] = patient_name

    # Set authorization header
    headers = {
        'Authorization': f'Bearer {token}'
    }

    # Open the audio file in a context manager so the handle is closed
    # once the request completes
    with open(audio_file_path, 'rb') as audio_file:
        response = requests.post(
            url, headers=headers, files={'audio': audio_file}, data=data
        )

    # Check for errors
    if response.status_code != 200:
        try:
            error_data = response.json()
        except ValueError:
            # The error body was not valid JSON
            error_data = {}
        raise Exception(error_data.get('error', 'Transcription failed'))

    return response.json()

# Example usage
try:
    token = "YOUR_AUTH_TOKEN"
    result = transcribe_consultation(
        token=token,
        audio_file_path="./consultation_recording.wav",
        doctor_name="Dr. Johnson",
        patient_name="Alice Smith"
    )

    # Print transcription
    print("TRANSCRIPTION:")
    print(result['transcription'])
    print("\nCLASSIFIED INFORMATION:")

    # Print classified information
    for category, data in result['classifiedInfo'].items():
        print(f"\n{category}:")
        if isinstance(data, list):
            for item in data:
                if isinstance(item, dict) and 'Medication' in item:
                    print(f"  - {item['Medication']}: {item.get('Dosage', 'N/A')}")
                else:
                    print(f"  - {item}")
        else:
            print(f"  {data}")

except Exception as e:
    print(f"Error: {str(e)}")

Best Practices

Audio Quality

- Record in a quiet environment with minimal background noise
- Use a good quality microphone positioned close to the speakers
- Aim for clear speech with minimal overlapping conversation

File Format

- Use WAV format for optimal quality
- Keep files under 100MB for faster processing
- For longer consultations, consider splitting into multiple recordings

Patient Privacy

- Always ensure you have appropriate consent for recording and processing patient conversations
- Follow applicable healthcare privacy regulations (HIPAA, GDPR, etc.)
- Use secure storage for all transcription data

Optimizing Results

- Always provide the doctor's name
- Include the patient's name when available
- Use the removeSilence option to improve processing efficiency
- For failed transcriptions, use the retry option instead of re-uploading large files

Post-Processing

- Review the transcription for accuracy, especially for critical medical information
- Verify medication names and dosages in the classified information
- Confirm that ICD-10 codes accurately reflect the diagnoses
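
The suggestion to split longer consultations into multiple recordings can be carried out with Python's standard `wave` module. The sketch below cuts an uncompressed WAV file into fixed-length chunks; `split_wav`, the chunk length, and the output naming scheme are illustrative assumptions rather than anything the API prescribes.

```python
import wave


def split_wav(src_path, chunk_seconds, dest_prefix):
    """Split a WAV file into consecutive chunks of at most chunk_seconds.

    Returns the list of chunk file paths in order.
    """
    if chunk_seconds <= 0:
        raise ValueError("chunk_seconds must be positive")

    paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = int(src.getframerate() * chunk_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            path = f"{dest_prefix}_{index:03d}.wav"
            with wave.open(path, "wb") as dst:
                # Copy channels, sample width and rate; the frame count in
                # the header is corrected automatically when the file closes
                dst.setparams(params)
                dst.writeframes(frames)
            paths.append(path)
            index += 1
    return paths
```

A call such as `split_wav("consultation.wav", 20 * 60, "consultation_part")` would produce 20-minute chunks. Fixed-length cuts can fall in the middle of a sentence, so splitting at a natural pause in the conversation is likely to give cleaner transcripts.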

Error Handling

HTTP Status Code	Description	Resolution
400	Missing required fields	Ensure all required parameters are included
400	Invalid audio format	Upload audio in WAV format
413	Audio file too large	Reduce file size or split recording
500	Transcription failed	Check audio quality and try again
500	Classification failed	Retry with better audio or more complete conversation
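
A client can turn the table above into actionable messages. The following sketch maps a status code and the JSON error body (the `error` and `details` fields from the error response examples) to a message with a suggested action. `explain_error` and the wording of the hints are assumptions made for illustration; because 400 and 500 each cover two cases in the table, each hint combines both resolutions.

```python
# Suggested actions, condensed from the error table (wording is illustrative)
RESOLUTIONS = {
    400: "include all required parameters and upload audio in WAV format",
    413: "reduce the file size or split the recording",
    500: "check audio quality and try again",
}


def explain_error(status_code, body=None):
    """Build a readable message from a status code and a parsed error body."""
    body = body or {}
    message = f"HTTP {status_code}: {body.get('error', 'Unknown error')}"
    if body.get("details"):
        message += f" ({body['details']})"
    hint = RESOLUTIONS.get(status_code)
    if hint:
        message += f". Suggested action: {hint}"
    return message


print(explain_error(400, {"error": "Invalid request", "details": "Doctor name is required"}))
```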

Limitations

- Maximum audio file size: 100MB
- Maximum recording length: 60 minutes
- Supported languages: English and Arabic
- Supported audio formats: WAV (others will be automatically converted)
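
Checking a recording against the size and length limits before uploading avoids a wasted request. The sketch below is one way to do this for WAV files using the standard library; the helper names are hypothetical, and it assumes that 100MB means 100 × 1024 × 1024 bytes, which the limits above do not specify.

```python
import wave

MAX_BYTES = 100 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MAX_SECONDS = 60 * 60          # 60 minutes


def limit_violations(size_bytes, duration_seconds):
    """Return a list of documented limits that the recording exceeds."""
    problems = []
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 100MB")
    if duration_seconds > MAX_SECONDS:
        problems.append("recording exceeds 60 minutes")
    return problems


def wav_duration_seconds(source):
    """Duration of a WAV file, given a path string or a binary file object."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# A 90-minute, 50MB recording breaks only the length limit
print(limit_violations(50 * 1024 * 1024, 90 * 60))
```

For a file on disk, `os.path.getsize(path)` supplies the size and `wav_duration_seconds(path)` the duration.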

Next Steps

Continue to the Dictation API documentation to learn about specialized medical dictation processing.
