Medical Transcription API

Last updated: April 16, 2025

The Nuxera Medical Transcription API converts audio recordings of medical consultations into accurate text transcripts with structured clinical information. This API is specifically designed to handle medical terminology and the natural flow of doctor-patient conversations.

Endpoint

URL: /api/transcribe

Method: POST

Content-Type: multipart/form-data

Authentication

Include your bearer token in the Authorization header:

Authorization: Bearer <YOUR_TOKEN>

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | File | Yes | Audio file in WAV format |
| doctorName | string | Yes | Name of the doctor conducting the consultation |
| patientName | string | No | Name of the patient (improves accuracy) |
| removeSilence | string | No | Set to "true" to remove silence from audio (improves processing) |
| retry | string | No | Set to "true" to reprocess existing audio without uploading again |
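
The flag parameters are strings rather than booleans, so a client has to serialize them itself. A minimal sketch of a helper that builds the non-file form fields is shown here. The name `build_form_fields` is hypothetical, and leaving out an unset flag (instead of sending "false") is an assumption based on both flags being optional.

```python
def build_form_fields(doctor_name, patient_name=None, remove_silence=False, retry=False):
    """Build the non-file multipart fields for /api/transcribe (hypothetical helper)."""
    # doctorName is the only required text field in the parameter table
    if not doctor_name or not doctor_name.strip():
        raise ValueError("doctorName is required")

    fields = {"doctorName": doctor_name}
    if patient_name:
        fields["patientName"] = patient_name
    # Flags are sent as the string "true"; unset flags are omitted (assumption)
    if remove_silence:
        fields["removeSilence"] = "true"
    if retry:
        fields["retry"] = "true"
    return fields
```

The returned dictionary can be passed as the `data` argument of `requests.post`, alongside the audio file in `files`.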

Response

Success Response (200 OK)

{
  "transcription": "Doctor: Hello, how are you feeling today?\nPatient: I've been having severe headaches for the past week, especially in the morning.\nDoctor: I see. Have you noticed any other symptoms along with the headaches?\nPatient: Yes, I sometimes feel nauseous, and bright lights bother me.",
  "classifiedInfo": {
    "ICD-10 Code": [
      "G43.909 - Migraine, unspecified, not intractable, without status migrainosus"
    ],
    "Chief Complaint": ["Severe headaches for one week, worse in the morning"],
    "History Of Present Illness": "Patient reports severe headaches for the past week, particularly in the morning. Associated symptoms include nausea and photophobia. No prior history of migraines mentioned.",
    "Current Medications": ["None mentioned"],
    "Imaging Results": [],
    "Assessment": [
      "Patient presents with symptoms consistent with migraine headaches including photophobia and nausea"
    ],
    "Diagnosis": ["Migraine headache"],
    "Prescriptions": [
      {
        "Medication": "Sumatriptan",
        "Dosage": "50mg as needed for migraine attacks"
      }
    ],
    "Plan": [
      "Start Sumatriptan for acute attacks",
      "Avoid triggers including bright lights",
      "Keep headache diary"
    ],
    "Follow-up": ["2 weeks"]
  }
}

Error Response (400 Bad Request)

{
  "error": "Invalid request",
  "details": "Doctor name is required"
}

Error Response (500 Internal Server Error)

{
  "error": "Failed to transcribe audio"
}
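
Both error bodies above carry an `error` field, and the 400 body adds `details`. The sketch below turns a status code and raw response body into a single readable message. The function name `describe_error` is hypothetical, and the fallback for non-JSON bodies is an assumption, since the source only documents the two JSON shapes shown.

```python
import json


def describe_error(status_code, body_text):
    """Combine 'error' and optional 'details' into one message (hypothetical helper)."""
    try:
        payload = json.loads(body_text)
    except (ValueError, TypeError):
        payload = {}  # body was empty or not JSON (e.g. a proxy error page)
    if not isinstance(payload, dict):
        payload = {}

    message = payload.get("error") or "Unknown error"
    details = payload.get("details")
    if details:
        message = f"{message}: {details}"
    return f"HTTP {status_code} - {message}"
```

With a `requests` response object, this could be called as `describe_error(response.status_code, response.text)`.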

Response Structure

The API response contains two main components:

1. Transcription

A text representation of the conversation with speakers labeled ("Doctor:" and "Patient:"). The transcription preserves the natural flow of the conversation and includes all relevant clinical details.
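
Because each turn is prefixed with a speaker label, the transcription can be split into structured turns. The following is a sketch that assumes one turn per line separated by newlines, as in the sample response; `split_turns` is a hypothetical name, and real transcripts may need more tolerant parsing.

```python
def split_turns(transcription):
    """Split a labeled transcript into (speaker, text) tuples (hypothetical helper)."""
    turns = []
    for line in transcription.splitlines():
        speaker, sep, text = line.partition(":")
        speaker = speaker.strip()
        if sep and speaker in ("Doctor", "Patient"):
            turns.append((speaker, text.strip()))
        elif turns and line.strip():
            # Unlabeled line: treat it as a continuation of the previous turn.
            # Unlabeled lines before the first turn are dropped.
            prev_speaker, prev_text = turns[-1]
            turns[-1] = (prev_speaker, f"{prev_text} {line.strip()}")
    return turns
```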

2. Classified Information

The API automatically extracts and categorizes clinical information into standard sections:

| Field | Description | Example |
|---|---|---|
| ICD-10 Code | Standard diagnostic codes | "J45.909 - Unspecified asthma, uncomplicated" |
| Chief Complaint | Primary reason for visit | "Shortness of breath for 3 days" |
| History Of Present Illness | Detailed symptom history | "Patient reports developing shortness of breath 3 days ago..." |
| Current Medications | Active medications | ["Albuterol inhaler", "Lisinopril 10mg daily"] |
| Imaging Results | Findings from any imaging | ["Chest X-ray shows no acute abnormalities"] |
| Assessment | Clinical evaluation | ["Symptoms consistent with asthma exacerbation"] |
| Diagnosis | Medical diagnoses | ["Acute asthma exacerbation"] |
| Prescriptions | New or continued medications | [{"Medication": "Prednisone", "Dosage": "40mg daily for 5 days"}] |
| Plan | Treatment and next steps | ["Increase albuterol use", "Complete prednisone course"] |
| Follow-up | Recommended follow-up timing | ["1 week"] |
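
The ICD-10 entries in the examples combine the code and its description in one string. A minimal sketch for separating them follows, assuming the "CODE - description" format seen in the examples holds for every entry; the helper name `parse_icd10_entries` is hypothetical.

```python
def parse_icd10_entries(classified_info):
    """Split "CODE - description" strings into dictionaries (hypothetical helper)."""
    raw_entries = classified_info.get("ICD-10 Code", [])
    if isinstance(raw_entries, str):
        raw_entries = [raw_entries]  # tolerate a single string instead of a list

    entries = []
    for raw in raw_entries:
        # Split on the first " - " only, so hyphens in the description survive
        code, sep, description = raw.partition(" - ")
        entries.append({
            "code": code.strip(),
            "description": description.strip() if sep else "",
        })
    return entries
```

Keeping the code separate makes it easier to check each code against an ICD-10 reference during review.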

Example Usage

cURL

curl -X POST 'https://api.nuxera.ai/api/transcribe' \
  -H 'Authorization: Bearer YOUR_TOKEN' \
  -F 'audio=@consultation.wav' \
  -F 'doctorName=Dr. Smith' \
  -F 'patientName=John Doe' \
  -F 'removeSilence=true'

JavaScript

// Function to transcribe a medical consultation
async function transcribeConsultation(
  token,
  audioFile,
  doctorName,
  patientName
) {
  const formData = new FormData();
  formData.append('audio', audioFile);
  formData.append('doctorName', doctorName);

  if (patientName) {
    formData.append('patientName', patientName);
  }

  formData.append('removeSilence', 'true'); // Optional

  try {
    const response = await fetch('https://api.nuxera.ai/api/transcribe', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
      },
      body: formData,
    });

    if (!response.ok) {
      // Fall back to a generic message if the error body is not valid JSON
      const errorData = await response.json().catch(() => ({}));
      throw new Error(errorData.error || 'Transcription failed');
    }

    return await response.json();
  } catch (error) {
    console.error('Transcription error:', error);
    throw error;
  }
}

// Example usage
document
  .getElementById('upload-form')
  .addEventListener('submit', async (event) => {
    event.preventDefault();

    const audioFile = document.getElementById('audio-file').files[0];
    const doctorName = document.getElementById('doctor-name').value;
    const patientName = document.getElementById('patient-name').value;

    try {
      const result = await transcribeConsultation(
        localStorage.getItem('authToken'),
        audioFile,
        doctorName,
        patientName
      );

      // Display transcription
      document.getElementById('transcription').textContent =
        result.transcription;

      // Process classified information
      // (displayClassifiedInfo is not defined here; your page must supply it)
      displayClassifiedInfo(result.classifiedInfo);
    } catch (error) {
      alert(`Error: ${error.message}`);
    }
  });

Python

import requests

def transcribe_consultation(token, audio_file_path, doctor_name, patient_name=None, remove_silence=True):
    """
    Transcribe a medical consultation using the Nuxera API

    Args:
        token (str): Authentication token
        audio_file_path (str): Path to the audio file
        doctor_name (str): Name of the doctor
        patient_name (str, optional): Name of the patient
        remove_silence (bool, optional): Whether to remove silence from the audio

    Returns:
        dict: API response containing transcription and classified information
    """
    url = 'https://api.nuxera.ai/api/transcribe'

    # Prepare form fields
    data = {
        'doctorName': doctor_name,
        'removeSilence': 'true' if remove_silence else 'false'
    }

    if patient_name:
        data['patientName'] = patient_name

    # Set authorization header
    headers = {
        'Authorization': f'Bearer {token}'
    }

    # Open the audio in a context manager so the file handle is closed after the upload
    with open(audio_file_path, 'rb') as audio_file:
        response = requests.post(url, headers=headers, files={'audio': audio_file}, data=data)

    # Check for errors; fall back to a generic message if the body is not JSON
    if response.status_code != 200:
        try:
            error_data = response.json()
        except ValueError:
            error_data = {}
        raise Exception(error_data.get('error', 'Transcription failed'))

    return response.json()

# Example usage
try:
    token = "YOUR_AUTH_TOKEN"
    result = transcribe_consultation(
        token=token,
        audio_file_path="./consultation_recording.wav",
        doctor_name="Dr. Johnson",
        patient_name="Alice Smith"
    )

    # Print transcription
    print("TRANSCRIPTION:")
    print(result['transcription'])
    print("\nCLASSIFIED INFORMATION:")

    # Print classified information
    for category, data in result['classifiedInfo'].items():
        print(f"\n{category}:")
        if isinstance(data, list):
            for item in data:
                if isinstance(item, dict) and 'Medication' in item:
                    print(f"  - {item['Medication']}: {item.get('Dosage', 'N/A')}")
                else:
                    print(f"  - {item}")
        else:
            print(f"  {data}")

except Exception as e:
    print(f"Error: {str(e)}")

Best Practices

  1. Audio Quality

    • Record in a quiet environment with minimal background noise
    • Use a good quality microphone positioned close to the speakers
    • Aim for clear speech with minimal overlapping conversation
  2. File Format

    • Use WAV format for optimal quality
    • Keep files under 100MB for faster processing
    • For longer consultations, consider splitting into multiple recordings
  3. Patient Privacy

    • Always ensure you have appropriate consent for recording and processing patient conversations
    • Follow applicable healthcare privacy regulations (HIPAA, GDPR, etc.)
    • Use secure storage for all transcription data
  4. Optimizing Results

    • Always provide the doctor's name
    • Include the patient's name when available
    • Use the removeSilence option to improve processing efficiency
    • For failed transcriptions, use the retry option instead of re-uploading large files
  5. Post-Processing

    • Review the transcription for accuracy, especially for critical medical information
    • Verify medication names and dosages in the classified information
    • Confirm that ICD-10 codes accurately reflect the diagnoses

Error Handling

| Error Code | Description | Resolution |
|---|---|---|
| 400 | Missing required fields | Ensure all required parameters are included |
| 400 | Invalid audio format | Upload audio in WAV format |
| 413 | Audio file too large | Reduce file size or split recording |
| 500 | Transcription failed | Check audio quality and try again |
| 500 | Classification failed | Retry with better audio or more complete conversation |
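
Since one status code can map to more than one cause, a client may want to show every documented resolution for a given code. The sketch below encodes the table above as a lookup; `RESOLUTIONS` and `resolution_hints` are hypothetical names, and codes not listed in the table return no hints.

```python
# Resolutions copied from the error table; a status code can have several causes
RESOLUTIONS = {
    400: [
        "Ensure all required parameters are included",
        "Upload audio in WAV format",
    ],
    413: ["Reduce file size or split recording"],
    500: [
        "Check audio quality and try again",
        "Retry with better audio or more complete conversation",
    ],
}


def resolution_hints(status_code):
    """Return the documented resolutions for a status code (hypothetical helper)."""
    return list(RESOLUTIONS.get(status_code, []))
```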

Limitations

  • Maximum audio file size: 100MB
  • Maximum recording length: 60 minutes
  • Supported languages: English and Arabic
  • Supported audio formats: WAV (others will be automatically converted)
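
The size and length limits above can be checked locally before uploading. This is a sketch using Python's standard `wave` module, which reads uncompressed PCM WAV only; the function name `check_recording` is hypothetical, and treating 100MB as 100 × 1024 × 1024 bytes is an assumption, since the source does not define the unit precisely.

```python
import os
import wave

MAX_BYTES = 100 * 1024 * 1024  # assumption: 1MB = 1024 * 1024 bytes
MAX_SECONDS = 60 * 60          # 60-minute limit from the list above


def check_recording(path):
    """Return a list of problems found; an empty list means no limit was hit."""
    problems = []
    if os.path.getsize(path) > MAX_BYTES:
        problems.append("file exceeds 100MB")
    try:
        with wave.open(path, "rb") as wav:
            seconds = wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError):
        # The wave module could not parse the file as PCM WAV
        problems.append("not a readable PCM WAV file")
    else:
        if seconds > MAX_SECONDS:
            problems.append("recording exceeds 60 minutes")
    return problems
```

A file flagged as unreadable here is not necessarily rejected by the API, since other formats are converted automatically; the check only confirms what can be verified locally.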

Next Steps

Continue to the Dictation API documentation to learn about specialized medical dictation processing.