Silvia Onofrei in Towards Data Science · Relation Extraction with Llama3 Models · Enhanced relation extraction by fine-tuning Llama3-8B with a synthetic dataset created using Llama3-70B · 12 min read · Apr 26, 2024
Silvia Onofrei in Towards Data Science · Cypher Generation: The Good, The Bad and The Messy · Methods for creating fine-tuning datasets for text-to-Cypher generation. · 13 min read · Jan 29, 2024
Silvia Onofrei in Towards Data Science · Leverage KeyBERT, HDBSCAN and Zephyr-7B-Beta to Build a Knowledge Graph · LLM-enhanced natural language processing and traditional machine learning techniques are used to extract structure and to build a knowledge… · 19 min read · Jan 7, 2024
Silvia Onofrei in Towards Data Science · Transforming text into vectors: TSDAE’s unsupervised approach to enhanced embeddings · Combine TSDAE pre-training on a target domain with supervised fine-tuning on a general-purpose corpus to enhance the quality of the… · 11 min read · Oct 16, 2023
Silvia Onofrei · Code Llama’s “Knowledge” of Neo4j’s Cypher Query Language · A simple experiment into how Code Llama does with Neo4j’s query language Cypher · 7 min read · Aug 28, 2023
Silvia Onofrei · Topic Modeling with Healthcare Spark NLP · How to leverage Healthcare Spark NLP pretrained models to categorize a small collection of publications on equine colic · 11 min read · May 26, 2022
Silvia Onofrei · Did Stacking Improve My PySpark Churn Prediction Model? · Stacking with PySpark to predict customer churn for Sparkify, a fictional music platform. · 8 min read · Feb 17, 2022
Silvia Onofrei · User Activity Based Churn Prediction With PySpark on an AWS-EMR Cluster · Analyze and predict customer churn for Sparkify, a fictional music platform. · 22 min read · Feb 17, 2022
Silvia Onofrei · Web Scraping Mini Project · Extract Text From Udacity Course Catalog Website Using Beautiful Soup and Selenium · 5 min read · Sep 21, 2021
Silvia Onofrei · Who Are the Data Professionals? · Three-way analysis of the StackOverflow Developers Annual Survey. · 9 min read · Aug 18, 2021