Electronic health records (EHRs) contain abundant unstructured information (physician notes, consult reports, and symptom descriptions) that may hold critical clues for diagnosing rare diseases. Natural language processing (NLP) techniques extract clinical entities, map them to standard ontologies such as the Human Phenotype Ontology (HPO), and convert free text into structured features. Applied across a patient's records, NLP systems can reveal recurring symptom patterns, track disease progression, and surface diagnostic clues that were previously overlooked. Challenges include linguistic variability across institutions and inconsistent documentation formats, which call for adaptable models and extensive lexicons. Privacy protection and regulatory compliance are also major concerns when working with sensitive clinical text. Integrating NLP-extracted features with genomic analyses enables models to propose prioritized diagnostic candidates, aiding clinical decision-making.
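The extract, map, and featurize steps can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon in place of a full HPO release, and the names `LEXICON`, `extract_hpo_terms`, and `to_feature_vector` are hypothetical, introduced only for this example. It uses plain dictionary matching rather than a trained clinical NLP model, and it does not handle negation ("no seizures"), abbreviations, or misspellings, all of which a production system would need to address. The HPO identifiers are shown for illustration and should be checked against the current ontology release.

```python
import re

# Toy lexicon (assumption): surface phrase -> HPO identifier.
# A real system would load terms and synonyms from an HPO release.
LEXICON = {
    "seizure": "HP:0001250",
    "seizures": "HP:0001250",
    "hypotonia": "HP:0001252",
    "low muscle tone": "HP:0001252",
    "global developmental delay": "HP:0001263",
    "microcephaly": "HP:0000252",
}

# Try longer phrases first so multi-word synonyms are preferred.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(p) for p in sorted(LEXICON, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)

# Fixed, ordered list of HPO IDs defining the feature columns.
VOCABULARY = sorted(set(LEXICON.values()))


def extract_hpo_terms(note):
    """Return the set of HPO IDs whose lexicon phrases appear in the note."""
    return {LEXICON[m.group(1).lower()] for m in _PATTERN.finditer(note)}


def to_feature_vector(note, vocabulary=VOCABULARY):
    """Encode a note as a binary presence vector over the given HPO IDs."""
    found = extract_hpo_terms(note)
    return [1 if hpo_id in found else 0 for hpo_id in vocabulary]


if __name__ == "__main__":
    note = "Infant with low muscle tone and recurrent Seizures."
    print(sorted(extract_hpo_terms(note)))
    print(to_feature_vector(note))
```

Binary presence vectors of this kind are one simple way to combine text-derived phenotypes with genomic features in a downstream ranking model. Mapping several surface phrases to one HPO ID is what absorbs some of the wording differences between institutions.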