René's URL Explorer Experiment


Title: Publications by Tag · Machine Learning for Big Code and Naturalness

Domain: ml4code.github.io

Content-Type: text/html; charset=utf-8

Links:

Contribute to ML4Codehttps://ml4code.github.io/contributing.html
Machine Learning for Big Code and Naturalness https://ml4code.github.io/
List of Papershttps://ml4code.github.io/papers.html
Papers by Taghttps://ml4code.github.io/tags.html
2D Map of Papershttps://ml4code.github.io/tsne-viz.html
Topic-based Explorerhttps://ml4code.github.io/topic-viz.html
Resources, Courses & Eventshttps://ml4code.github.io/resources.html
Contributinghttps://ml4code.github.io/contributing.html
Miltos Allamanishttps://miltos.allamanis.com
Jekyllhttps://jekyllrb.com
Hydehttps://github.com/poole/hyde
adversarialhttps://ml4code.github.io/tags.html#adversarial
APIhttps://ml4code.github.io/tags.html#API
autocompletehttps://ml4code.github.io/tags.html#autocomplete
benchmarkhttps://ml4code.github.io/tags.html#benchmark
benchmarkinghttps://ml4code.github.io/tags.html#benchmarking
bimodalhttps://ml4code.github.io/tags.html#bimodal
Binary Codehttps://ml4code.github.io/tags.html#Binary Code
clonehttps://ml4code.github.io/tags.html#clone
code completionhttps://ml4code.github.io/tags.html#code completion
code generationhttps://ml4code.github.io/tags.html#code generation
code similarityhttps://ml4code.github.io/tags.html#code similarity
compilationhttps://ml4code.github.io/tags.html#compilation
completionhttps://ml4code.github.io/tags.html#completion
cybersecurityhttps://ml4code.github.io/tags.html#cybersecurity
datasethttps://ml4code.github.io/tags.html#dataset
decompilationhttps://ml4code.github.io/tags.html#decompilation
defecthttps://ml4code.github.io/tags.html#defect
deobfuscationhttps://ml4code.github.io/tags.html#deobfuscation
documentationhttps://ml4code.github.io/tags.html#documentation
dynamichttps://ml4code.github.io/tags.html#dynamic
edithttps://ml4code.github.io/tags.html#edit
editinghttps://ml4code.github.io/tags.html#editing
educationhttps://ml4code.github.io/tags.html#education
evaluationhttps://ml4code.github.io/tags.html#evaluation
executionhttps://ml4code.github.io/tags.html#execution
feature locationhttps://ml4code.github.io/tags.html#feature location
fuzzinghttps://ml4code.github.io/tags.html#fuzzing
generalizabilityhttps://ml4code.github.io/tags.html#generalizability
generationhttps://ml4code.github.io/tags.html#generation
GNNhttps://ml4code.github.io/tags.html#GNN
grammarhttps://ml4code.github.io/tags.html#grammar
human evaluationhttps://ml4code.github.io/tags.html#human evaluation
information extractionhttps://ml4code.github.io/tags.html#information extraction
instruction tuninghttps://ml4code.github.io/tags.html#instruction tuning
interpretabilityhttps://ml4code.github.io/tags.html#interpretability
language modelhttps://ml4code.github.io/tags.html#language model
large language modelshttps://ml4code.github.io/tags.html#large language models
LLMhttps://ml4code.github.io/tags.html#LLM
logginghttps://ml4code.github.io/tags.html#logging
memorizationhttps://ml4code.github.io/tags.html#memorization
metricshttps://ml4code.github.io/tags.html#metrics
migrationhttps://ml4code.github.io/tags.html#migration
naminghttps://ml4code.github.io/tags.html#naming
natural language generationhttps://ml4code.github.io/tags.html#natural language generation
natural language processinghttps://ml4code.github.io/tags.html#natural language processing
notebookhttps://ml4code.github.io/tags.html#notebook
optimizationhttps://ml4code.github.io/tags.html#optimization
pattern mininghttps://ml4code.github.io/tags.html#pattern mining
plagiarism detectionhttps://ml4code.github.io/tags.html#plagiarism detection
pretraininghttps://ml4code.github.io/tags.html#pretraining
program analysishttps://ml4code.github.io/tags.html#program analysis
program synthesishttps://ml4code.github.io/tags.html#program synthesis
question answeringhttps://ml4code.github.io/tags.html#question answering
refactoringhttps://ml4code.github.io/tags.html#refactoring
repairhttps://ml4code.github.io/tags.html#repair
representationhttps://ml4code.github.io/tags.html#representation
retrievalhttps://ml4code.github.io/tags.html#retrieval
Reverse Engineeringhttps://ml4code.github.io/tags.html#Reverse Engineering
reviewhttps://ml4code.github.io/tags.html#review
searchhttps://ml4code.github.io/tags.html#search
statichttps://ml4code.github.io/tags.html#static
static analysishttps://ml4code.github.io/tags.html#static analysis
stylehttps://ml4code.github.io/tags.html#style
summarizationhttps://ml4code.github.io/tags.html#summarization
surveyhttps://ml4code.github.io/tags.html#survey
synthesishttps://ml4code.github.io/tags.html#synthesis
test generationhttps://ml4code.github.io/tags.html#test generation
toolhttps://ml4code.github.io/tags.html#tool
topic modelinghttps://ml4code.github.io/tags.html#topic modeling
topic modellinghttps://ml4code.github.io/tags.html#topic modelling
traceabilityhttps://ml4code.github.io/tags.html#traceability
Transformerhttps://ml4code.github.io/tags.html#Transformer
Transformershttps://ml4code.github.io/tags.html#Transformers
translationhttps://ml4code.github.io/tags.html#translation
typeshttps://ml4code.github.io/tags.html#types
variable misusehttps://ml4code.github.io/tags.html#variable misuse
verificationhttps://ml4code.github.io/tags.html#verification
vulnerabilityhttps://ml4code.github.io/tags.html#vulnerability
Adversarial Examples for Models of Codehttps://ml4code.github.io/publications/yefet2019adversarial/
Generating Adversarial Examples for Holding Robustness of Source Code Processing Modelshttps://ml4code.github.io/publications/zhang2020generating/
Adversarial Robustness for Codehttps://ml4code.github.io/publications/bielik2020adversarial/
Embedding Java Classes with code2vec: Improvements from Variable Obfuscationhttps://ml4code.github.io/publications/compton2020embedding/
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformationshttps://ml4code.github.io/publications/rabin2021generalizability/
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completionhttps://ml4code.github.io/publications/schuster2021you/
Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022understanding/
Semantic Robustness of Models of Source Codehttps://ml4code.github.io/publications/henkel2020semantic/
Backdoors in Neural Models of Source Codehttps://ml4code.github.io/publications/ramakrishnan2020backdoors/
Lexical Statistical Machine Translation for Language Migrationhttps://ml4code.github.io/publications/nguyen2013lexical/
Statistical Learning Approach for Mining API Usage Mappings for Code Migrationhttps://ml4code.github.io/publications/nguyen2014statistical/
Parameter-Free Probabilistic API Mining across GitHubhttps://ml4code.github.io/publications/fowkes2016parameter/
Learning API Usages from Bytecode: A Statistical Approachhttps://ml4code.github.io/publications/nguyen2016learning/
Deep API Learninghttps://ml4code.github.io/publications/gu2016deep/
Mapping API Elements for Code Migration with Vector Representationshttps://ml4code.github.io/publications/nguyen2016mapping/
DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learninghttps://ml4code.github.io/publications/gu2017deepam/
Function Assistant: A Tool for NL Querying of APIshttps://ml4code.github.io/publications/richardson2017function/
Learning Technical Correspondences in Technical Documentationhttps://ml4code.github.io/publications/richardson2017learning/
Exploring API Embedding for API Usages and Applicationshttps://ml4code.github.io/publications/nguyen2017exploring/
Finding Likely Errors with Bayesian Specificationshttps://ml4code.github.io/publications/murali2017finding/
Bayesian Sketch Learning for Program Synthesishttps://ml4code.github.io/publications/murali2017bayesian/
Polyglot Semantic Parsing in APIshttps://ml4code.github.io/publications/richardson2018polyglot/
Unsupervised Learning of API Aliasing Specificationshttps://ml4code.github.io/publications/ederhardt2019unsupervised/
SAR: Learning Cross-Language API Mappings with Little Knowledgehttps://ml4code.github.io/publications/bui2019learning/
Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embeddinghttps://ml4code.github.io/publications/chen2019mining/
AutoPandas: neural-backed generators for program synthesishttps://ml4code.github.io/publications/bavishi2019autopandas/
Learning from Examples to Improve Code Completion Systemshttps://ml4code.github.io/publications/bruch2009learning/
On the Naturalness of Softwarehttps://ml4code.github.io/publications/hindle2012naturalness/
Code Completion with Statistical Language Modelshttps://ml4code.github.io/publications/raychev2014code/
Graph-based Statistical Language Model for Codehttps://ml4code.github.io/publications/nguyen2015graph/
Intelligent Code Completion with Bayesian Networkshttps://ml4code.github.io/publications/proksch2015intelligent/
Learning Python Code Suggestion with a Sparse Pointer Networkhttps://ml4code.github.io/publications/bhoopchand2016learning/
Neural Code Completionhttps://ml4code.github.io/publications/wang2016neural/
Code Completion with Neural Attention and Pointer Networkshttps://ml4code.github.io/publications/li2017code/
Pythia: AI-assisted Code Completion Systemhttps://ml4code.github.io/publications/svyatkovskiy2019pythia/
Learning Autocompletion from Real-World Datasetshttps://ml4code.github.io/publications/aye2020learning/
Sequence Model Design for Code Completion in the Modern IDEhttps://ml4code.github.io/publications/aye2020sequence/
Code Prediction by Feeding Trees to Transformershttps://ml4code.github.io/publications/kim2020code/
A Structural Model for Contextual Code Changeshttps://ml4code.github.io/publications/brody2020structural/
IntelliCode Compose: Code Generation Using Transformerhttps://ml4code.github.io/publications/svyatkovskiy2020intellicode/
Fast and Memory-Efficient Neural Code Completionhttps://ml4code.github.io/publications/svyatkovskiy2020fast/
On-the-Fly Adaptation of Source Code Models using Meta-Learninghttps://ml4code.github.io/publications/shrivastava2020on-the-fly/
Suggesting Comment Completions for Python using Neural Language Modelshttps://ml4code.github.io/publications/ciurumelea2020suggesting/
Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Modelshttps://ml4code.github.io/publications/li2021toward/
Learning to Extend Program Graphs to Work-in-Progress Codehttps://ml4code.github.io/publications/li2021learning/
Improving Code Autocompletion with Transfer Learninghttps://ml4code.github.io/publications/zhou2021improving/
You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completionhttps://ml4code.github.io/publications/schuster2021you/
On the Embeddings of Variables in Recurrent Neural Networks for Source Codehttps://ml4code.github.io/publications/chirkova2021embeddings/
ReACC: A Retrieval-Augmented Code Completion Frameworkhttps://ml4code.github.io/publications/lu2022reacc/
All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logshttps://ml4code.github.io/publications/bibaev2022all/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
ConTest: A Unit Test Completion Benchmark featuring Contexthttps://ml4code.github.io/publications/villmow2021contest/
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generationhttps://ml4code.github.io/publications/lu2021codexglue/
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsinghttps://ml4code.github.io/publications/patil2022exploring/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Modelshttps://ml4code.github.io/publications/chen2024ppm/
Natural Language Models for Predicting Programming Commentshttps://ml4code.github.io/publications/movshovitz2013natural/
Using Semantic Unification to Generate Regular Expressions from Natural Languagehttps://ml4code.github.io/publications/kushman2013using/
NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulationhttps://ml4code.github.io/publications/gulwani2014nlyze/
Synthesizing Java expressions from free-form querieshttps://ml4code.github.io/publications/gvero2015synthesizing/
Learning to Generate Pseudo-code from Source Code using Statistical Machine Translationhttps://ml4code.github.io/publications/oda2015learning/
A Bimodal Modelling of Source Code and Natural Languagehttps://ml4code.github.io/publications/allamanis2015bimodal/
Summarizing Source Code using a Neural Attention Modelhttps://ml4code.github.io/publications/iyer2016summarizing/
Latent Predictor Networks for Code Generationhttps://ml4code.github.io/publications/ling2016latent/
CodeSum: Translate Program Language to Natural Languagehttps://ml4code.github.io/publications/hu2017codesum/
Automatically Generating Commit Messages from Diffs using Neural Machine Translationhttps://ml4code.github.io/publications/jiang2017automatically/
Program Synthesis from Natural Language Using Recurrent Neural Networkshttps://ml4code.github.io/publications/lin2017program/
pix2code: Generating Code from a Graphical User Interface Screenshothttps://ml4code.github.io/publications/beltramelli2017pix2code/
Function Assistant: A Tool for NL Querying of APIshttps://ml4code.github.io/publications/richardson2017function/
The Code2Text Challenge: Text Generation in Source Code Librarieshttps://ml4code.github.io/publications/richardson2017code2text/
A Syntactic Neural Model for General-Purpose Code Generationhttps://ml4code.github.io/publications/yin2017syntactic/
Learning Technical Correspondences in Technical Documentationhttps://ml4code.github.io/publications/richardson2017learning/
Generating Regular Expressions from Natural Language Specifications: Are We There Yet?https://ml4code.github.io/publications/zhong2018generating/
Mapping Language to Code in Programmatic Contexthttps://ml4code.github.io/publications/iyer2018mapping/
Deep Learning to Detect Redundant Method Commentshttps://ml4code.github.io/publications/louis2018deep/
NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating Systemhttps://ml4code.github.io/publications/lin2018nl2bash/
Polyglot Semantic Parsing in APIshttps://ml4code.github.io/publications/richardson2018polyglot/
A Retrieve-and-Edit Framework for Predicting Structured Outputshttps://ml4code.github.io/publications/hashimoto2018retrieve/
TypeWriter: Neural Type Prediction with Search-based Validationhttps://ml4code.github.io/publications/pradel2019typewriter/
SPoC: Search-based Pseudocode to Codehttps://ml4code.github.io/publications/kulal2019spoc/
JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generationhttps://ml4code.github.io/publications/agashe2019julce/
Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentiallyhttps://ml4code.github.io/publications/zhang2019learning/
NL2Type: Inferring JavaScript Function Types from Natural Language Informationhttps://ml4code.github.io/publications/malik2019nl2type/
OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraintshttps://ml4code.github.io/publications/pandi2020opttyper/
Incorporating External Knowledge through Pre-training for Natural Language to Code Generationhttps://ml4code.github.io/publications/xu2020incorporating/
Associating Natural Language Comment and Source Code Entitieshttps://ml4code.github.io/publications/panthaplackel2020associating/
TAG : Type Auxiliary Guiding for Code Comment Generationhttps://ml4code.github.io/publications/cai2020tag/
Deep Just-In-Time Inconsistency Detection Between Comments and Source Codehttps://ml4code.github.io/publications/panthaplackel2020deep/
Code to Comment "Translation": Data, Metrics, Baselining & Evaluationhttps://ml4code.github.io/publications/gros2020code/
Learning to Update Natural Language Comments Based on Code Changeshttps://ml4code.github.io/publications/panthaplackel2020learning/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
Where should I comment my code? A dataset and model for predicting locations that need commentshttps://ml4code.github.io/publications/louis2020where/
Suggesting Comment Completions for Python using Neural Language Modelshttps://ml4code.github.io/publications/ciurumelea2020suggesting/
Co-Training for Commit Classificationhttps://ml4code.github.io/publications/lee2021cotraining/
Learning to Reverse DNNs from AI Programs Automaticallyhttps://ml4code.github.io/publications/chen2022learning/
Deep Learning Code Fragments for Code Clone Detectionhttps://ml4code.github.io/publications/white2016deep/
Oreo: detection of clones in the twilight zonehttps://ml4code.github.io/publications/saini2018oreo/
Deep Learning Similarities from Different Representations of Source Codehttps://ml4code.github.io/publications/tufano2018deep/
Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimizationhttps://ml4code.github.io/publications/ding2019asm2vec/
Learning-based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detectionhttps://ml4code.github.io/publications/buech2019learning/
funcGNN: A Graph Neural Network Approach to Program Similarityhttps://ml4code.github.io/publications/nair2020funcgnn/
Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Treehttps://ml4code.github.io/publications/wang2020detecting/
Modeling Functional Similarity in Source Code with Graph-Based Siamese Networkshttps://ml4code.github.io/publications/mehrotra2020modeling/
Cross-Language Binary-Source Code Matching with Intermediate Representationshttps://ml4code.github.io/publications/gui2022cross/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
Repository-Level Prompt Generation for Large Language Models of Codehttps://ml4code.github.io/publications/shrivastava2020repository/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
A Machine Learning Framework for Programming by Examplehttps://ml4code.github.io/publications/menon2013machine/
Using Semantic Unification to Generate Regular Expressions from Natural Languagehttps://ml4code.github.io/publications/kushman2013using/
Structured Generative Models of Natural Source Codehttps://ml4code.github.io/publications/maddison2014structured/
Code Completion with Statistical Language Modelshttps://ml4code.github.io/publications/raychev2014code/
NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulationhttps://ml4code.github.io/publications/gulwani2014nlyze/
Phrase-Based Statistical Translation of Programming Languageshttps://ml4code.github.io/publications/karaivanov2014phrase/
Synthesizing Java expressions from free-form querieshttps://ml4code.github.io/publications/gvero2015synthesizing/
Visualizing and Understanding Recurrent Networkshttps://ml4code.github.io/publications/karpathy2015visualizing/
A deep language model for software codehttps://ml4code.github.io/publications/dam2016deep/
Learning Programs from Noisy Datahttps://ml4code.github.io/publications/raychev2016learning/
PHOG: Probabilistic Model for Codehttps://ml4code.github.io/publications/bielik2016phog/
Latent Predictor Networks for Code Generationhttps://ml4code.github.io/publications/ling2016latent/
Program Synthesis from Natural Language Using Recurrent Neural Networkshttps://ml4code.github.io/publications/lin2017program/
pix2code: Generating Code from a Graphical User Interface Screenshothttps://ml4code.github.io/publications/beltramelli2017pix2code/
A Syntactic Neural Model for General-Purpose Code Generationhttps://ml4code.github.io/publications/yin2017syntactic/
Neural Attribute Machines for Program Generationhttps://ml4code.github.io/publications/amodio2017neural/
Abstract Syntax Networks for Code Generation and Semantic Parsinghttps://ml4code.github.io/publications/rabinovich2017abstract/
Synthesizing benchmarks for predictive modelinghttps://ml4code.github.io/publications/cummins2017synthesizing/
DeepFix: Fixing Common C Language Errors by Deep Learninghttps://ml4code.github.io/publications/gupta2017deepfix/
Deep Reinforcement Learning for Programming Language Correctionhttps://ml4code.github.io/publications/gupta2018deep/
Bayesian Sketch Learning for Program Synthesishttps://ml4code.github.io/publications/murali2017bayesian/
Compiler Fuzzing through Deep Learninghttps://ml4code.github.io/publications/cummins2018compiler/
Generating Regular Expressions from Natural Language Specifications: Are We There Yet?https://ml4code.github.io/publications/zhong2018generating/
Mapping Language to Code in Programmatic Contexthttps://ml4code.github.io/publications/iyer2018mapping/
NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating Systemhttps://ml4code.github.io/publications/lin2018nl2bash/
CODIT: Code Editing with Tree-Based Neural Machine Translationhttps://ml4code.github.io/publications/chakraborty2018tree2tree/
A Retrieve-and-Edit Framework for Predicting Structured Outputshttps://ml4code.github.io/publications/hashimoto2018retrieve/
Learning to Generate Corrective Patches using Neural Machine Translationhttps://ml4code.github.io/publications/hata2018learning/
Learning to Repair Software Vulnerabilities with Generative Adversarial Networkshttps://ml4code.github.io/publications/harer2018learning/
SampleFix: Learning to Correct Programs by Sampling Diverse Fixeshttps://ml4code.github.io/publications/hajipour2019samplefix/
A Grammar-Based Structural CNN Decoder for Code Generationhttps://ml4code.github.io/publications/sun2019grammar/
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repairhttps://ml4code.github.io/publications/chen2019sequencer/
Generative Code Modeling with Graphshttps://ml4code.github.io/publications/brockschmidt2019generative/
Structural Language Models for Any-Code Generationhttps://ml4code.github.io/publications/alon2019structural/
Code Generation as a Dual Task of Code Summarizationhttps://ml4code.github.io/publications/wei2019code/
DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testinghttps://ml4code.github.io/publications/liu2019deepfuzz/
A case study on machine learning for synthesizing benchmarkshttps://ml4code.github.io/publications/goens2019case/
Learning Programmatic Idioms for Scalable Semantic Parsinghttps://ml4code.github.io/publications/iyer2019learning/
Incorporating External Knowledge through Pre-training for Natural Language to Code Generationhttps://ml4code.github.io/publications/xu2020incorporating/
Semantic Scaffolds for Pseudocode-to-Code Generationhttps://ml4code.github.io/publications/zhong2020semantic/
Unit Test Case Generation with Transformershttps://ml4code.github.io/publications/tufano2020unit/
Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformershttps://ml4code.github.io/publications/tufano2020generating/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
IntelliCode Compose: Code Generation Using Transformerhttps://ml4code.github.io/publications/svyatkovskiy2020intellicode/
Retrieval Augmented Code Generation and Summarizationhttps://ml4code.github.io/publications/parvez2021retrieval/
Energy-Based Models for Code Generation under Compilability Constraintshttps://ml4code.github.io/publications/korbak2021energy/
Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchyhttps://ml4code.github.io/publications/clement2021long/
Time-Efficient Code Completion Model for the R Programming Languagehttps://ml4code.github.io/publications/popov2021time/
Shellcode_IA32: A Dataset for Automatic Shellcode Generationhttps://ml4code.github.io/publications/liguori2021shellcode_ia32/
TOGA: A Neural Method for Test Oracle Generationhttps://ml4code.github.io/publications/dinella2022toga/
InCoder: A Generative Model for Code Infilling and Synthesishttps://ml4code.github.io/publications/fried2022incoder/
DocCoder: Generating Code by Retrieving and Reading Docshttps://ml4code.github.io/publications/zhou2022docoder/
Human perceiving behavior modeling in evaluation of code generation modelshttps://ml4code.github.io/publications/kovalchuk2022human/
Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Modelshttps://ml4code.github.io/publications/vaithilingam2022expectation/
Open-ended Knowledge Tracinghttps://ml4code.github.io/publications/liu2022open/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
MISIM: An End-to-End Neural Code Similarity Systemhttps://ml4code.github.io/publications/ye2020misim/
Senatus - A Fast and Accurate Code-to-Code Recommendation Enginehttps://ml4code.github.io/publications/silavong2022senatus/
Cross-Language Binary-Source Code Matching with Intermediate Representationshttps://ml4code.github.io/publications/gui2022cross/
CV4Code: Sourcecode Understanding via Visual Code Representationshttps://ml4code.github.io/publications/shi2022cv4code/
Can Large Language Model Detect Plagiarism in Source Code?https://ml4code.github.io/publications/brach2024can/
DeepDelta: Learning to Repair Compilation Errorshttps://ml4code.github.io/publications/mesbah2019deepdelta/
A Neural Approach to Decompiled Identifier Renaminghttps://ml4code.github.io/publications/lacomis2019neural/
Static Neural Compiler Optimization via Deep Reinforcement Learninghttps://ml4code.github.io/publications/mammadli2020static/
ComPy-Learn: A toolbox for exploring machine learning representations for compilershttps://ml4code.github.io/publications/brauckmann2020compy/
Compiler-based graph representations for deep learning models of codehttps://ml4code.github.io/publications/brauckmann2020compiler/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generationhttps://ml4code.github.io/publications/zhang2023repocoder/
RepoFusion: Training Code Models to Understand Your Repositoryhttps://ml4code.github.io/publications/shrivastava2023repofusion/
A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Taskshttps://ml4code.github.io/publications/casey2024survey/
A parallel corpus of Python functions and documentation strings for automated code documentation and code generationhttps://ml4code.github.io/publications/barone2017parallel/
StaQC: A Systematically Mined Question-Code Dataset from Stack Overflowhttps://ml4code.github.io/publications/yao2018staqc/
Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflowhttps://ml4code.github.io/publications/yin2018mining/
Public Git Archive: a Big Code dataset for allhttps://ml4code.github.io/publications/markovtsev2018public/
CodeSearchNet Challenge: Evaluating the State of Semantic Code Searchhttps://ml4code.github.io/publications/husain2019codesearchnet/
JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generationhttps://ml4code.github.io/publications/agashe2019julce/
Neural Code Search Evaluation Datasethttps://ml4code.github.io/publications/li2019neural/
Recommendations for Datasets for Source Code Summarizationhttps://ml4code.github.io/publications/leclair2019recommendations/
The Adverse Effects of Code Duplication in Machine Learning Models of Codehttps://ml4code.github.io/publications/allamanis2019adverse/
Graph4Code: A Machine Interpretable Knowledge Graph for Codehttps://ml4code.github.io/publications/abdelaziz2020graph4code/
Associating Natural Language Comment and Source Code Entitieshttps://ml4code.github.io/publications/panthaplackel2020associating/
Code and Named Entity Recognition in StackOverflowhttps://ml4code.github.io/publications/tabassum2020code/
ProGraML: Graph-based Deep Learning for Program Optimization and Analysishttps://ml4code.github.io/publications/cummins2020programl/
Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Sizehttps://ml4code.github.io/publications/monperrus2021megadiff/
CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Modelhttps://ml4code.github.io/publications/jung2021commitbert/
CoSQA: 20,000+ Web Queries for Code Search and Question Answeringhttps://ml4code.github.io/publications/huang2021cosqa/
ConTest: A Unit Test Completion Benchmark featuring Contexthttps://ml4code.github.io/publications/villmow2021contest/
A large-scale benchmark for few-shot program induction and synthesishttps://ml4code.github.io/publications/alet2021largescale/
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generationhttps://ml4code.github.io/publications/orlanski2021reading/
Time-Efficient Code Completion Model for the R Programming Languagehttps://ml4code.github.io/publications/popov2021time/
ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inferencehttps://ml4code.github.io/publications/mir2021manytypes4py/
Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Taskshttps://ml4code.github.io/publications/puri2021project/
Shellcode_IA32: A Dataset for Automatic Shellcode Generationhttps://ml4code.github.io/publications/liguori2021shellcode_ia32/
Impact of Evaluation Methodologies on Code Summarizationhttps://ml4code.github.io/publications/nie2021evaluation/
Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Datahttps://ml4code.github.io/publications/hazoom2021text/
The Stack: 3TB of permissively licensed source codehttps://ml4code.github.io/publications/kocetkov2022stack/
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptionshttps://ml4code.github.io/publications/bieber2022static/
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsinghttps://ml4code.github.io/publications/patil2022exploring/
JEMMA: An Extensible Java Dataset for ML4Code Applicationshttps://ml4code.github.io/publications/karmakar2022jemma/
OctoPack: Instruction Tuning Code Large Language Modelshttps://ml4code.github.io/publications/muennighoff2023octopack/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detectionhttps://ml4code.github.io/publications/chen2023diversevul/
Learning to Align the Source Code to the Compiled Object Codehttps://ml4code.github.io/publications/levy2017learning/
Towards Neural Decompilationhttps://ml4code.github.io/publications/katz2019towards/
Coda: An End-to-End Neural Program Decompilerhttps://ml4code.github.io/publications/fu2019coda/
DIRECT : A Transformer-based Model for Decompiled Identifier Renaminghttps://ml4code.github.io/publications/nitin2021direct/
Code Translation with Compiler Representationshttps://ml4code.github.io/publications/szafraniec2022code/
LLM4Decompile: Decompiling Binary Code with Large Language Modelshttps://ml4code.github.io/publications/tan2024llm4decompile/
Using Web Corpus Statistics for Program Analysishttps://ml4code.github.io/publications/hsiao2014using/
On the “Naturalness” of Buggy Codehttps://ml4code.github.io/publications/ray2015naturalness/
Bugram: bug detection with n-gram language modelshttps://ml4code.github.io/publications/wang2016bugram/
Automatically Learning Semantic Features for Defect Predictionhttps://ml4code.github.io/publications/wang2016automatically/
Software Defect Prediction via Convolutional Neural Networkhttps://ml4code.github.io/publications/li2017software/
Deep Learning to Find Bugshttps://ml4code.github.io/publications/pradel2017deep/
Open Vocabulary Learning on Source Code with a Graph-Structured Cachehttps://ml4code.github.io/publications/cvitkovic2018open/
Learning to Represent Programs with Graphshttps://ml4code.github.io/publications/allamanis2018learning/
Exploring the Naturalness of Buggy Code with Recurrent Neural Networkhttps://ml4code.github.io/publications/lanchantin2018exploring/
Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networkshttps://ml4code.github.io/publications/li2019improving/
Scalable Taint Specification Inference with Big Codehttps://ml4code.github.io/publications/chibotaru2019scalable/
Neural Attribution for Semantic Bug-Localization in Student Programshttps://ml4code.github.io/publications/gupta2019neural/
Learning Semantic Program Embeddings with Graph Interval Neural Networkhttps://ml4code.github.io/publications/wang2020learning/
Global Relational Models of Source Codehttps://ml4code.github.io/publications/hellendoorn2020global/
OffSide: Learning to Identify Mistakes in Boundary Conditionshttps://ml4code.github.io/publications/briem2020offside/
SCELMo: Source Code Embeddings from Language Modelshttps://ml4code.github.io/publications/karampatsis2020scelmo/
Self-Supervised Bug Detection and Repairhttps://ml4code.github.io/publications/allamanis2021self/
Co-Training for Commit Classificationhttps://ml4code.github.io/publications/lee2021cotraining/
Deep Learning based Vulnerability Detection: Are We There Yet?https://ml4code.github.io/publications/chakraborty2020deep/
On Distribution Shift in Learning-based Bug Detectorshttps://ml4code.github.io/publications/he2022distribution/
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptionshttps://ml4code.github.io/publications/bieber2022static/
Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixeshttps://ml4code.github.io/publications/richter2022can/
Large Language Models and Simple, Stupid Bugshttps://ml4code.github.io/publications/jesse2023large/
Predicting Program Properties from “Big Code”https://ml4code.github.io/publications/raychev2015predicting/
Statistical Deobfuscation of Android Applicationshttps://ml4code.github.io/publications/bichsel2016statistical/
Towards Better Program Obfuscation: Optimization via Language Modelshttps://ml4code.github.io/publications/liu2016towards/
Recovering Clear, Natural Identifiers from Obfuscated JS Nameshttps://ml4code.github.io/publications/vasilescu2017recovering/
Recovering Variable Names for Minified Code with Usage Contextshttps://ml4code.github.io/publications/tran2019recovering/
Neural Reverse Engineering of Stripped Binarieshttps://ml4code.github.io/publications/david2019neural/
A Neural Approach to Decompiled Identifier Renaminghttps://ml4code.github.io/publications/lacomis2019neural/
Natural Language Models for Predicting Programming Commentshttps://ml4code.github.io/publications/movshovitz2013natural/
A parallel corpus of Python functions and documentation strings for automated code documentation and code generationhttps://ml4code.github.io/publications/barone2017parallel/
Learning Technical Correspondences in Technical Documentationhttps://ml4code.github.io/publications/richardson2017learning/
Deep Learning to Detect Redundant Method Commentshttps://ml4code.github.io/publications/louis2018deep/
Improving Automatic Source Code Summarization via Deep Reinforcement Learninghttps://ml4code.github.io/publications/wan2018improving/
Structured Neural Summarizationhttps://ml4code.github.io/publications/fernandes2019structured/
A Neural Model for Generating Natural Language Summaries of Program Subroutineshttps://ml4code.github.io/publications/leclair2019neural/
TAG: Type Auxiliary Guiding for Code Comment Generationhttps://ml4code.github.io/publications/cai2020tag/
TranS^3: A Transformer-based Framework for Unifying Code Summarization and Code Searchhttps://ml4code.github.io/publications/wang2020trans/
Deep Just-In-Time Inconsistency Detection Between Comments and Source Codehttps://ml4code.github.io/publications/panthaplackel2020deep/
Code to Comment "Translation": Data, Metrics, Baselining & Evaluationhttps://ml4code.github.io/publications/gros2020code/
Learning to Update Natural Language Comments Based on Code Changeshttps://ml4code.github.io/publications/panthaplackel2020learning/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
NaturalCC: A Toolkit to Naturalize the Source Code Corpushttps://ml4code.github.io/publications/wan2020naturalcc/
Where should I comment my code? A dataset and model for predicting locations that need commentshttps://ml4code.github.io/publications/louis2020where/
Suggesting Comment Completions for Python using Neural Language Modelshttps://ml4code.github.io/publications/ciurumelea2020suggesting/
Automating Just-In-Time Comment Updatinghttps://ml4code.github.io/publications/liu2020automating/
Learning to Describe Solutions for Bug Reports Based on Developer Discussionshttps://ml4code.github.io/publications/panthaplackel2021learning/
Assemble Foundation Models for Automatic Code Summarizationhttps://ml4code.github.io/publications/jian2022assemble/
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognitionhttps://ml4code.github.io/publications/sharma2022lamner/
Learning Scalable and Precise Representation of Program Semanticshttps://ml4code.github.io/publications/wang2019learning/
Blended, precise semantic program embeddingshttps://ml4code.github.io/publications/wang2020blended/
Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networkshttps://ml4code.github.io/publications/bieber2020learning/
TraceFixer: Execution Trace-Driven Program Repairhttps://ml4code.github.io/publications/bouzenia2023tracefixer/
Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learninghttps://ml4code.github.io/publications/yadavally2024predictive/
A Study of Repetitiveness of Code Changes in Software Evolutionhttps://ml4code.github.io/publications/nguyen2013study/
Automatically Generating Commit Messages from Diffs using Neural Machine Translationhttps://ml4code.github.io/publications/jiang2017automatically/
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changeshttps://ml4code.github.io/publications/loyola2017neural/
Content Aware Source Code Change Description Generationhttps://ml4code.github.io/publications/loyola2018content/
Learning How to Mutate Source Code from Bug-Fixeshttps://ml4code.github.io/publications/tufano2018learning/
Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?https://ml4code.github.io/publications/liu2018neural/
Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patternshttps://ml4code.github.io/publications/nguyen2019graph/
On Learning Meaningful Code Changes via Neural Machine Translationhttps://ml4code.github.io/publications/tufano2019learning/
Learning to Fix Build Errors with Graph2Diff Neural Networkshttps://ml4code.github.io/publications/tarlow2019learning/
Generating commit messages from diffs using pointer-generator networkhttps://ml4code.github.io/publications/liu2019generating/
Commit Message Generation for Source Code Changeshttps://ml4code.github.io/publications/xu2019commit/
DeepDelta: Learning to Repair Compilation Errorshttps://ml4code.github.io/publications/mesbah2019deepdelta/
Commit2Vec: Learning Distributed Representations of Code Changeshttps://ml4code.github.io/publications/commit2vec2019lozoya/
Learning to Represent Editshttps://ml4code.github.io/publications/yin2019learning/
Neural Networks for Modeling Source Code Editshttps://ml4code.github.io/publications/zhao2019neural/
DLFix: Context-based Code Transformation Learning for Automated Program Repairhttps://ml4code.github.io/publications/li2020dlfix/
Hoppity: Learning Bug Detection and Repairhttps://ml4code.github.io/publications/dinella2020hoppity/
CC2Vec: Distributed Representations of Code Changeshttps://ml4code.github.io/publications/hoang2020cc2vec/
Graph-based, Self-Supervised Program Repair from Diagnostic Feedbackhttps://ml4code.github.io/publications/yasunaga2020graph/
Copy that! Editing Sequences by Copying Spanshttps://ml4code.github.io/publications/panthaplackel2020copy/
Deep Just-In-Time Inconsistency Detection Between Comments and Source Codehttps://ml4code.github.io/publications/panthaplackel2020deep/
A Structural Model for Contextual Code Changeshttps://ml4code.github.io/publications/brody2020structural/
Learning to Update Natural Language Comments Based on Code Changeshttps://ml4code.github.io/publications/panthaplackel2020learning/
Unsupervised Learning of General-Purpose Embeddings for Code Changeshttps://ml4code.github.io/publications/pravilov2021unsupervised/
Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Sizehttps://ml4code.github.io/publications/monperrus2021megadiff/
Semantic Bug Seeding: A Learning-Based Approach for Creating Realistic Bugshttps://ml4code.github.io/publications/patra2021semantic/
Jointly Learning to Repair Code and Generate Commit Messagehttps://ml4code.github.io/publications/bai2021jointly/
DeepMerge: Learning to Merge Programshttps://ml4code.github.io/publications/dinella2021deepmerge/
On Multi-Modal Learning of Editing Source Codehttps://ml4code.github.io/publications/chakraborty2021multimodal/
A Syntax-Guided Edit Decoder for Neural Program Repairhttps://ml4code.github.io/publications/zhu2921syntax/
Learning to Model Editing Processeshttps://ml4code.github.io/publications/reid2022learning/
CoditT5: Pretraining for Source Code and Natural Language Editinghttps://ml4code.github.io/publications/zhang2022coditt5/
Grace: Language Models Meet Code Editshttps://ml4code.github.io/publications/gupta2023grace/
Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructionshttps://ml4code.github.io/publications/cassano2023can/
A system to grade computer programming skills using machine learninghttps://ml4code.github.io/publications/srikant2014system/
Learning Program Embeddings to Propagate Feedback on Student Codehttps://ml4code.github.io/publications/piech2015learning/
Question Independent Grading using Machine Learning: The Case of Computer Program Gradinghttps://ml4code.github.io/publications/singh2016question/
ProtoTransformer: A Meta-Learning Approach to Providing Student Feedbackhttps://ml4code.github.io/publications/wu2021prototransformer/
Open-ended Knowledge Tracinghttps://ml4code.github.io/publications/liu2022open/
Testing Neural Program Analyzershttps://ml4code.github.io/publications/rabin2019testing/
The Adverse Effects of Code Duplication in Machine Learning Models of Codehttps://ml4code.github.io/publications/allamanis2019adverse/
Towards Demystifying Dimensions of Source Code Embeddingshttps://ml4code.github.io/publications/rabin2020demystifying/
CodeBLEU: a Method for Automatic Evaluation of Code Synthesishttps://ml4code.github.io/publications/ren2020codebleu/
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformationshttps://ml4code.github.io/publications/rabin2021generalizability/
Impact of Evaluation Methodologies on Code Summarizationhttps://ml4code.github.io/publications/nie2021evaluation/
Memorization and Generalization in Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022memorization/
An Extensive Study on Pre-trained Models for Program Understanding and Generationhttps://ml4code.github.io/publications/zeng2022extensive/
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysishttps://ml4code.github.io/publications/naik2022probing/
Semantic Similarity Metrics for Evaluating Source Code Summarizationhttps://ml4code.github.io/publications/haque2022semantic/
Human perceiving behavior modeling in evaluation of code generation modelshttps://ml4code.github.io/publications/kovalchuk2022human/
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsinghttps://ml4code.github.io/publications/patil2022exploring/
Natural Language to Code Generation in Interactive Data Science Notebookshttps://ml4code.github.io/publications/yin2022natural/
CrystalBLEU: Precisely and Efficiently Measuring the Similarity of Codehttps://ml4code.github.io/publications/eghbali2022crystalbleu/
Productivity Assessment of Neural Code Completionhttps://ml4code.github.io/publications/ziegler2022productivity/
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Codehttps://ml4code.github.io/publications/zhou2022codebertscore/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
CodeScore: Evaluating Code Generation by Learning Code Executionhttps://ml4code.github.io/publications/dong2023codescore/
PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Modelshttps://ml4code.github.io/publications/chen2024ppm/
LLM4Decompile: Decompiling Binary Code with Large Language Modelshttps://ml4code.github.io/publications/tan2024llm4decompile/
Learning to Executehttps://ml4code.github.io/publications/zaremba2014learning/
Show Your Work: Scratchpads for Intermediate Computation with Language Modelshttps://ml4code.github.io/publications/nye2021show/
SelfAPR: Self-supervised Program Repair with Test Execution Diagnosticshttps://ml4code.github.io/publications/ye2022selfapr/
CodeT: Code Generation with Generated Testshttps://ml4code.github.io/publications/chen2022codet/
Code Execution with Pre-trained Language Modelshttps://ml4code.github.io/publications/liu2023code/
LExecutor: Learning-Guided Executionhttps://ml4code.github.io/publications/souza2023lexecutor/
Exploring the Use of Deep Learning for Feature Locationhttps://ml4code.github.io/publications/corley2015exploring/
Learning to Fuzz: Application-Independent Fuzz Testing with Probabilistic, Generative Models of Input Datahttps://ml4code.github.io/publications/patra2016learning/
Compiler Fuzzing through Deep Learninghttps://ml4code.github.io/publications/cummins2018compiler/
NEUZZ: Efficient Fuzzing with Neural Program Smoothinghttps://ml4code.github.io/publications/she2019neuzz/
DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testinghttps://ml4code.github.io/publications/liu2019deepfuzz/
Learning to Fuzz from Symbolic Execution with Application to Smart Contractshttps://ml4code.github.io/publications/he2019learning/
Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzerhttps://ml4code.github.io/publications/lee2020montage/
Universal Fuzzing via Large Language Modelshttps://ml4code.github.io/publications/xia2023universal/
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformationshttps://ml4code.github.io/publications/rabin2021generalizability/
Memorization and Generalization in Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022memorization/
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsinghttps://ml4code.github.io/publications/patil2022exploring/
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generationhttps://ml4code.github.io/publications/li2023think/
Gated Graph Sequence Neural Networkshttps://ml4code.github.io/publications/li2016gated/
Open Vocabulary Learning on Source Code with a Graph-Structured Cachehttps://ml4code.github.io/publications/cvitkovic2018open/
Learning to Represent Programs with Graphshttps://ml4code.github.io/publications/allamanis2018learning/
Simulating Execution Time of Tensor Programs using Graph Neural Networkshttps://ml4code.github.io/publications/tomczak2019simulating/
Generative Code Modeling with Graphshttps://ml4code.github.io/publications/brockschmidt2019generative/
Structured Neural Summarizationhttps://ml4code.github.io/publications/fernandes2019structured/
Neural Reverse Engineering of Stripped Binarieshttps://ml4code.github.io/publications/david2019neural/
Using GGNN to recommend log statement levelhttps://ml4code.github.io/publications/li2019using/
Program Classification Using Gated Graph Attention Neural Network for Online Programming Servicehttps://ml4code.github.io/publications/lu2019program/
AutoPandas: neural-backed generators for program synthesishttps://ml4code.github.io/publications/bavishi2019autopandas/
Learning to Fuzz from Symbolic Execution with Application to Smart Contractshttps://ml4code.github.io/publications/he2019learning/
Inferring Javascript types using Graph Neural Networkshttps://ml4code.github.io/publications/schrouff2019inferring/
Learning Semantic Program Embeddings with Graph Interval Neural Networkhttps://ml4code.github.io/publications/wang2020learning/
Global Relational Models of Source Codehttps://ml4code.github.io/publications/hellendoorn2020global/
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networkshttps://ml4code.github.io/publications/zhou2019devign/
LambdaNet: Probabilistic Type Inference using Graph Neural Networkshttps://ml4code.github.io/publications/wei2020lambdanet/
Graph-based, Self-Supervised Program Repair from Diagnostic Feedbackhttps://ml4code.github.io/publications/yasunaga2020graph/
Typilus: Neural Type Hintshttps://ml4code.github.io/publications/allamanis2020typilus/
Learning Graph Structure With A Finite-State Automaton Layerhttps://ml4code.github.io/publications/johnson2020learning/
ProGraML: Graph-based Deep Learning for Program Optimization and Analysishttps://ml4code.github.io/publications/cummins2020programl/
Towards Learning Representations of Binary Executable Files for Security Taskshttps://ml4code.github.io/publications/arakelyan2020towards/
funcGNN: A Graph Neural Network Approach to Program Similarityhttps://ml4code.github.io/publications/nair2020funcgnn/
Deep Graph Matching and Searching for Semantic Code Retrievalhttps://ml4code.github.io/publications/ling2020deep/
ComPy-Learn: A toolbox for exploring machine learning representations for compilershttps://ml4code.github.io/publications/brauckmann2020compy/
Compiler-based graph representations for deep learning models of codehttps://ml4code.github.io/publications/brauckmann2020compiler/
Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Treehttps://ml4code.github.io/publications/wang2020detecting/
Modeling Functional Similarity in Source Code with Graph-Based Siamese Networkshttps://ml4code.github.io/publications/mehrotra2020modeling/
Learning to Represent Programs with Heterogeneous Graphshttps://ml4code.github.io/publications/wang2020learning2/
Self-Supervised Bug Detection and Repairhttps://ml4code.github.io/publications/allamanis2021self/
Structured Statistical Syntax Tree Predictionhttps://ml4code.github.io/publications/omar2013structured/
Building Program Vector Representations for Deep Learninghttps://ml4code.github.io/publications/mou2014building/
Structured Generative Models of Natural Source Codehttps://ml4code.github.io/publications/maddison2014structured/
Mining Idioms from Source Codehttps://ml4code.github.io/publications/allamanis2014mining/
Learning to Generate Pseudo-code from Source Code using Statistical Machine Translationhttps://ml4code.github.io/publications/oda2015learning/
A Bimodal Modelling of Source Code and Natural Languagehttps://ml4code.github.io/publications/allamanis2015bimodal/
Learning Programs from Noisy Datahttps://ml4code.github.io/publications/raychev2016learning/
PHOG: Probabilistic Model for Codehttps://ml4code.github.io/publications/bielik2016phog/
Convolutional Neural Networks over Tree Structures for Programming Language Processinghttps://ml4code.github.io/publications/mou2016convolutional/
A Syntactic Neural Model for General-Purpose Code Generationhttps://ml4code.github.io/publications/yin2017syntactic/
Neural Attribute Machines for Program Generationhttps://ml4code.github.io/publications/amodio2017neural/
Abstract Syntax Networks for Code Generation and Semantic Parsinghttps://ml4code.github.io/publications/rabinovich2017abstract/
Mining Semantic Loop Idioms from Big Codehttps://ml4code.github.io/publications/allamanis2017mining/
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networkshttps://ml4code.github.io/publications/bui2018cross/
CODIT: Code Editing with Tree-Based Neural Machine Translationhttps://ml4code.github.io/publications/chakraborty2018tree2tree/
A Grammar-Based Structural CNN Decoder for Code Generationhttps://ml4code.github.io/publications/sun2019grammar/
Capturing source code semantics via tree-based convolution over API-enhanced ASThttps://ml4code.github.io/publications/chen2019capturing/
Generative Code Modeling with Graphshttps://ml4code.github.io/publications/brockschmidt2019generative/
PathMiner: A Library for Mining of Path-Based Representations of Codehttps://ml4code.github.io/publications/kovalenko2019pathminer/
Learning Programmatic Idioms for Scalable Semantic Parsinghttps://ml4code.github.io/publications/iyer2019learning/
Automatic Source Code Summarization with Extended Tree-LSTMhttps://ml4code.github.io/publications/shido2019automatic/
Learning-based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detectionhttps://ml4code.github.io/publications/buech2019learning/
A Novel Neural Source Code Representation based on Abstract Syntax Treehttps://ml4code.github.io/publications/zhang2019novel/
Neural-Network Guided Expression Transformationhttps://ml4code.github.io/publications/edelmann2019neural/
DLFix: Context-based Code Transformation Learning for Automated Program Repairhttps://ml4code.github.io/publications/li2020dlfix/
Modular Tree Network for Source Code Representation Learninghttps://ml4code.github.io/publications/wang2020modular/
PSCS: A Path-based Neural Model for Semantic Code Searchhttps://ml4code.github.io/publications/sun2020pscs/
A Structural Model for Contextual Code Changeshttps://ml4code.github.io/publications/brody2020structural/
Predicting Vulnerability in Large Codebases With Deep Code Representationhttps://ml4code.github.io/publications/ashwath2020predicting/
TreeBERT: A Tree-Based Pre-Trained Model for Programming Languagehttps://ml4code.github.io/publications/jiang2021treebert/
Learning to Complete Code with Sketcheshttps://ml4code.github.io/publications/guo2022learning/
Grounded Copilot: How Programmers Interact with Code-Generating Modelshttps://ml4code.github.io/publications/barke2022grounded/
Semantic Similarity Metrics for Evaluating Source Code Summarizationhttps://ml4code.github.io/publications/haque2022semantic/
Human perceiving behavior modeling in evaluation of code generation modelshttps://ml4code.github.io/publications/kovalchuk2022human/
Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Modelshttps://ml4code.github.io/publications/vaithilingam2022expectation/
What is it like to program with artificial intelligence?https://ml4code.github.io/publications/sarkar2022what/
Productivity Assessment of Neural Code Completionhttps://ml4code.github.io/publications/ziegler2022productivity/
A Hidden Markov Model to Detect Coded Information Islands in Free Texthttps://ml4code.github.io/publications/cerulo2013hidden/
Irish: A Hidden Markov Model to detect coded information islands in free texthttps://ml4code.github.io/publications/cerulo2015irish/
NIRMAL: Automatic Identification of Software Relevant Tweets Leveraging Language Modelhttps://ml4code.github.io/publications/sharma2015nirmal/
Extracting Code from Programming Tutorial Videoshttps://ml4code.github.io/publications/yadid2016extracting/
A Deep Learning Approach to Identifying Source Code in Images and Videohttps://ml4code.github.io/publications/ott2018deep/
Evaluation of Type Inference with Textual Cueshttps://ml4code.github.io/publications/shirani2018evaluation/
Code and Named Entity Recognition in StackOverflowhttps://ml4code.github.io/publications/tabassum2020code/
Understanding Neural Code Intelligence Through Program Simplificationhttps://ml4code.github.io/publications/rabin2021understanding/
OctoPack: Instruction Tuning Code Large Language Modelshttps://ml4code.github.io/publications/muennighoff2023octopack/
Towards Demystifying Dimensions of Source Code Embeddingshttps://ml4code.github.io/publications/rabin2020demystifying/
Understanding Neural Code Intelligence Through Program Simplificationhttps://ml4code.github.io/publications/rabin2021understanding/
Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022understanding/
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysishttps://ml4code.github.io/publications/naik2022probing/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
On the Naturalness of Softwarehttps://ml4code.github.io/publications/hindle2012naturalness/
A Statistical Semantic Language Model for Source Codehttps://ml4code.github.io/publications/nguyen2013statistical/
Mining Source Code Repositories at Massive Scale Using Language Modelinghttps://ml4code.github.io/publications/allamanis2013mining/
Structured Statistical Syntax Tree Predictionhttps://ml4code.github.io/publications/omar2013structured/
Learning Natural Coding Conventionshttps://ml4code.github.io/publications/allamanis2014learning/
Structured Generative Models of Natural Source Codehttps://ml4code.github.io/publications/maddison2014structured/
Code Completion with Statistical Language Modelshttps://ml4code.github.io/publications/raychev2014code/
On the Localness of Softwarehttps://ml4code.github.io/publications/tu2014localness/
Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Modelshttps://ml4code.github.io/publications/campbell2014syntax/
Will they like this? Evaluating Code Contributions With Language Modelshttps://ml4code.github.io/publications/hellendoorn2015will/
Graph-based Statistical Language Model for Codehttps://ml4code.github.io/publications/nguyen2015graph/
Products, Developers, and Milestones: How Should I Build My N-Gram Language Modelhttps://ml4code.github.io/publications/saraiva2015products/
Visualizing and Understanding Recurrent Networkshttps://ml4code.github.io/publications/karpathy2015visualizing/
CACHECA: A Cache Language Model Based Code Suggestion Toolhttps://ml4code.github.io/publications/franks2015cacheca/
A deep language model for software codehttps://ml4code.github.io/publications/dam2016deep/
PHOG: Probabilistic Model for Codehttps://ml4code.github.io/publications/bielik2016phog/
Learning Python Code Suggestion with a Sparse Pointer Networkhttps://ml4code.github.io/publications/bhoopchand2016learning/
A Language Model for Statements of Software Codehttps://ml4code.github.io/publications/yang2017language/
Are Deep Neural Networks the Best Choice for Modeling Source Code?https://ml4code.github.io/publications/hellendoorn2017deep/
Code Completion with Neural Attention and Pointer Networkshttps://ml4code.github.io/publications/li2017code/
Building Language Models for Text with Named Entitieshttps://ml4code.github.io/publications/parvez2018building/
Exploring the Naturalness of Buggy Code with Recurrent Neural Networkhttps://ml4code.github.io/publications/lanchantin2018exploring/
Syntax and Sensibility: Using language models to detect and correct syntax errorshttps://ml4code.github.io/publications/santos2018syntax/
On the Impact of Refactoring Operations on Code Naturalnesshttps://ml4code.github.io/publications/lin2019impact/
Pythia: AI-assisted Code Completion Systemhttps://ml4code.github.io/publications/svyatkovskiy2019pythia/
Maybe Deep Neural Networks are the Best Choice for Modeling Source Codehttps://ml4code.github.io/publications/karampatsis2019deep/
Big Code != Big Vocabulary: Open-Vocabulary Models for Source Codehttps://ml4code.github.io/publications/karampatsis2020big/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzerhttps://ml4code.github.io/publications/lee2020montage/
IntelliCode Compose: Code Generation Using Transformerhttps://ml4code.github.io/publications/svyatkovskiy2020intellicode/
On-the-Fly Adaptation of Source Code Models using Meta-Learninghttps://ml4code.github.io/publications/shrivastava2020on-the-fly/
CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Modelhttps://ml4code.github.io/publications/jung2021commitbert/
Evaluating Large Language Models Trained on Codehttps://ml4code.github.io/publications/chen2021evaluating/
Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Modelshttps://ml4code.github.io/publications/li2021toward/
An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributionshttps://ml4code.github.io/publications/pearce2021empirical/
Capturing Structural Locality in Non-parametric Language Modelshttps://ml4code.github.io/publications/xu2021capturing/
Exploration of Convolutional Neural Network models for source code classificationhttps://ml4code.github.io/publications/barchi2021exploration/
Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchyhttps://ml4code.github.io/publications/clement2021long/
Time-Efficient Code Completion Model for the R Programming Languagehttps://ml4code.github.io/publications/popov2021time/
On the Naturalness and Localness of Software Logshttps://ml4code.github.io/publications/gholamian2021naturalness/
Neural Program Generation Modulo Static Analysishttps://ml4code.github.io/publications/mukherjee2021neural/
Memorization and Generalization in Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022memorization/
Efficient Training of Language Models to Fill in the Middlehttps://ml4code.github.io/publications/bavarian2022efficient/
Assemble Foundation Models for Automatic Code Summarizationhttps://ml4code.github.io/publications/jian2022assemble/
A Systematic Evaluation of Large Language Models of Codehttps://ml4code.github.io/publications/xu2022systematic/
Synchromesh: Reliable code generation from pre-trained language modelshttps://ml4code.github.io/publications/poesia2022synchromesh/
Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mappinghttps://ml4code.github.io/publications/parisi2022making/
Bridging Pre-trained Models and Downstream Tasks for Source Code Understandinghttps://ml4code.github.io/publications/deze2022bridging/
Learning to Complete Code with Sketcheshttps://ml4code.github.io/publications/guo2022learning/
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysishttps://ml4code.github.io/publications/naik2022probing/
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognitionhttps://ml4code.github.io/publications/sharma2022lamner/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Modelshttps://ml4code.github.io/publications/vaithilingam2022expectation/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippetshttps://ml4code.github.io/publications/lomshakov2023fine/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
(Partial) Program Dependence Learninghttps://ml4code.github.io/publications/yadavally2023partial/
Can Large Language Model Detect Plagiarism in Source Code?https://ml4code.github.io/publications/brach2024can/
LLM4Decompile: Decompiling Binary Code with Large Language Modelshttps://ml4code.github.io/publications/tan2024llm4decompile/
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Searchhttps://ml4code.github.io/publications/li2024rewriting/
A Learning-Based Approach to Static Program Slicinghttps://ml4code.github.io/publications/yadavally2024learning/
Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learninghttps://ml4code.github.io/publications/yadavally2024predictive/
A Static Evaluation of Code Completion by Large Language Modelshttps://ml4code.github.io/publications/ding2023static/
Can Large Language Model Detect Plagiarism in Source Code?https://ml4code.github.io/publications/brach2024can/
LLM4Decompile: Decompiling Binary Code with Large Language Modelshttps://ml4code.github.io/publications/tan2024llm4decompile/
Using GGNN to recommend log statement levelhttps://ml4code.github.io/publications/li2019using/
On the Naturalness and Localness of Software Logshttps://ml4code.github.io/publications/gholamian2021naturalness/
Using Deep Learning to Generate Complete Log Statementshttps://ml4code.github.io/publications/mastropaolo2022using/
Memorization and Generalization in Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022memorization/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Searchhttps://ml4code.github.io/publications/li2024rewriting/
Lexical Statistical Machine Translation for Language Migrationhttps://ml4code.github.io/publications/nguyen2013lexical/
Statistical Learning Approach for Mining API Usage Mappings for Code Migrationhttps://ml4code.github.io/publications/nguyen2014statistical/
Divide-and-Conquer Approach for Multi-phase Statistical Migration for Source Codehttps://ml4code.github.io/publications/nguyen2015divide/
Phrase-Based Statistical Translation of Programming Languageshttps://ml4code.github.io/publications/karaivanov2014phrase/
Using Machine Translation for Converting Python 2 to Python 3 Codehttps://ml4code.github.io/publications/aggarwal2015using/
Mapping API Elements for Code Migration with Vector Representationshttps://ml4code.github.io/publications/nguyen2016mapping/
Unsupervised Translation of Programming Languageshttps://ml4code.github.io/publications/lachaux2020unsupervised/
Leveraging Automated Unit Tests for Unsupervised Code Translationhttps://ml4code.github.io/publications/roziere2021leveraging/
Code Translation with Compiler Representationshttps://ml4code.github.io/publications/szafraniec2022code/
Learning Natural Coding Conventionshttps://ml4code.github.io/publications/allamanis2014learning/
Predicting Program Properties from “Big Code”https://ml4code.github.io/publications/raychev2015predicting/
Suggesting Accurate Method and Class Nameshttps://ml4code.github.io/publications/allamanis2015suggesting/
A Convolutional Attention Network for Extreme Summarization of Source Codehttps://ml4code.github.io/publications/allamanis2016convolutional/
Statistical Deobfuscation of Android Applicationshttps://ml4code.github.io/publications/bichsel2016statistical/
Recovering Clear, Natural Identifiers from Obfuscated JS Nameshttps://ml4code.github.io/publications/vasilescu2017recovering/
Context2Name: A Deep Learning-Based Approach to Infer Natural Variable Names from Usage Contextshttps://ml4code.github.io/publications/bavishi2017context2name/
Learning to Represent Programs with Graphshttps://ml4code.github.io/publications/allamanis2018learning/
A General Path-Based Representation for Predicting Program Propertieshttps://ml4code.github.io/publications/alon2018general/
code2vec: Learning Distributed Representations of Codehttps://ml4code.github.io/publications/alon2019code2vec/
A Neural Model for Method Name Generation from Functional Descriptionhttps://ml4code.github.io/publications/gao2019neural/
Recovering Variable Names for Minified Code with Usage Contextshttps://ml4code.github.io/publications/tran2019recovering/
Mercem: Method Name Recommendation Based on Call Graph Embeddinghttps://ml4code.github.io/publications/yonai2019mercem/
Learning to Spot and Refactor Inconsistent Method Nameshttps://ml4code.github.io/publications/liu2019learning/
code2seq: Generating Sequences from Structured Representations of Codehttps://ml4code.github.io/publications/alon2018code2seq/
Method name suggestion with hierarchical attention networkshttps://ml4code.github.io/publications/xu2019method/
Neural Reverse Engineering of Stripped Binarieshttps://ml4code.github.io/publications/david2019neural/
A Neural Approach to Decompiled Identifier Renaminghttps://ml4code.github.io/publications/lacomis2019neural/
Suggesting Natural Method Names to Check Name Consistencieshttps://ml4code.github.io/publications/nguyen2020suggesting/
Towards Demystifying Dimensions of Source Code Embeddingshttps://ml4code.github.io/publications/rabin2020demystifying/
Embedding Java Classes with code2vec: Improvements from Variable Obfuscationhttps://ml4code.github.io/publications/compton2020embedding/
Semantic Robustness of Models of Source Codehttps://ml4code.github.io/publications/henkel2020semantic/
InCoder: A Generative Model for Code Infilling and Synthesishttps://ml4code.github.io/publications/fried2022incoder/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IRhttps://ml4code.github.io/publications/barchi2019code/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
Can Large Language Model Detect Plagiarism in Source Code?https://ml4code.github.io/publications/brach2024can/
Natural Language to Code Generation in Interactive Data Science Notebookshttps://ml4code.github.io/publications/yin2022natural/
End-to-end Deep Learning of Optimization Heuristicshttps://ml4code.github.io/publications/cummins2017end/
Synthesizing benchmarks for predictive modelinghttps://ml4code.github.io/publications/cummins2017synthesizing/
Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IRhttps://ml4code.github.io/publications/barchi2019code/
Neural-Network Guided Expression Transformationhttps://ml4code.github.io/publications/edelmann2019neural/
ComPy-Learn: A toolbox for exploring machine learning representations for compilershttps://ml4code.github.io/publications/brauckmann2020compy/
Compiler-based graph representations for deep learning models of codehttps://ml4code.github.io/publications/brauckmann2020compiler/
Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Modelshttps://ml4code.github.io/publications/li2021toward/
Exploration of Convolutional Neural Network models for source code classificationhttps://ml4code.github.io/publications/barchi2021exploration/
Source Code Classification for Energy Efficiency in Parallel Ultra Low-Power Microcontrollershttps://ml4code.github.io/publications/parisi2021source/
Deep Learning Approaches to Source Code Analysis for Optimization of Heterogeneous Systems: Recent Results, Challenges and Opportunitieshttps://ml4code.github.io/publications/barchi2022deep/
Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mappinghttps://ml4code.github.io/publications/parisi2022making/
DeepPERF: A Deep Learning-Based Approach For Improving Software Performancehttps://ml4code.github.io/publications/garg2022deepperf/
Supersonic: Learning to Generate Source Code Optimizations in C/C++https://ml4code.github.io/publications/chen2023supersonic/
Rethinking Negative Pairs in Code Searchhttps://ml4code.github.io/publications/li2023rethinking/
Mining Idioms from Source Codehttps://ml4code.github.io/publications/allamanis2014mining/
KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Factshttps://ml4code.github.io/publications/movshovitz2015kb/
Parameter-Free Probabilistic API Mining across GitHubhttps://ml4code.github.io/publications/fowkes2016parameter/
Mining Semantic Loop Idioms from Big Codehttps://ml4code.github.io/publications/allamanis2017mining/
Topic modeling of public repositories at scale using names in source codehttps://ml4code.github.io/publications/markovtsev2017topic/
Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patternshttps://ml4code.github.io/publications/nguyen2019graph/
Learning Programmatic Idioms for Scalable Semantic Parsinghttps://ml4code.github.io/publications/iyer2019learning/
Mining Idioms in the Wildhttps://ml4code.github.io/publications/sivaraman2021mining/
Can Large Language Model Detect Plagiarism in Source Code?https://ml4code.github.io/publications/brach2024can/
Deep Transfer Learning for Source Code Modelinghttps://ml4code.github.io/publications/hussain2019deep/
GraphCodeBERT: Pre-training Code Representations with Data Flowhttps://ml4code.github.io/publications/guo2020graphcodebert/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
IntelliCode Compose: Code Generation Using Transformerhttps://ml4code.github.io/publications/svyatkovskiy2020intellicode/
Pre-trained Contextual Embedding of Source Codehttps://ml4code.github.io/publications/kanade2020pretrained/
CodeBERT: A Pre-Trained Model for Programming and Natural Languageshttps://ml4code.github.io/publications/feng2020codebert/
Contrastive Code Representation Learninghttps://ml4code.github.io/publications/jain2020contrastive/
SCELMo: Source Code Embeddings from Language Modelshttps://ml4code.github.io/publications/karampatsis2020scelmo/
Contrastive Learning for Source Code with Structural and Functional Propertieshttps://ml4code.github.io/publications/ding2021contrastive/
DOBF: A Deobfuscation Pre-Training Objective for Programming Languageshttps://ml4code.github.io/publications/roziere2021dobf/
SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representationhttps://ml4code.github.io/publications/wang2021syncobert/
Unified Pre-training for Program Understanding and Generationhttps://ml4code.github.io/publications/ahmad2021unified/
Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformationshttps://ml4code.github.io/publications/bui2021efficient/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Codehttps://ml4code.github.io/publications/wan2022what/
A Factor Graph Model for Software Bug Findinghttps://ml4code.github.io/publications/kremenek2007factor/
Predicting Program Properties from “Big Code”https://ml4code.github.io/publications/raychev2015predicting/
A User-Guided Approach to Program Analysishttps://ml4code.github.io/publications/mangal2015user/
Learning a Strategy for Adapting a Program Analysis via Bayesian Optimisationhttps://ml4code.github.io/publications/oh2015learning/
Gated Graph Sequence Neural Networkshttps://ml4code.github.io/publications/li2016gated/
Deep Learning to Find Bugshttps://ml4code.github.io/publications/pradel2017deep/
Finding Likely Errors with Bayesian Specificationshttps://ml4code.github.io/publications/murali2017finding/
User-guided program reasoning using Bayesian inferencehttps://ml4code.github.io/publications/raghothaman2018user/
Path-Based Function Embedding and its Application to Specification Mininghttps://ml4code.github.io/publications/defreez2018path/
Neural-Augmented Static Analysis of Android Communicationhttps://ml4code.github.io/publications/zhao2018neural/
Learning Loop Invariants for Program Verificationhttps://ml4code.github.io/publications/si2018learning/
RefiNym: Using Names to Refine Typeshttps://ml4code.github.io/publications/dash2018refinym/
Automated Vulnerability Detection in Source Code Using Deep Representation Learninghttps://ml4code.github.io/publications/russell2018automated/
Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IRhttps://ml4code.github.io/publications/barchi2019code/
On the Feasibility of Transfer-learning Code Smells using Deep Learninghttps://ml4code.github.io/publications/sharma2019feasibility/
Unsupervised Learning of API Aliasing Specificationshttps://ml4code.github.io/publications/ederhardt2019unsupervised/
Scalable Taint Specification Inference with Big Codehttps://ml4code.github.io/publications/chibotaru2019scalable/
Neural Bug Finding: A Study of Opportunities and Challengeshttps://ml4code.github.io/publications/habib2019neural/
Neural Program Repair by Jointly Learning to Localize and Repairhttps://ml4code.github.io/publications/vasic2019neural/
Inferring Javascript types using Graph Neural Networkshttps://ml4code.github.io/publications/schrouff2019inferring/
Neural Software Analysishttps://ml4code.github.io/publications/pradel2020neural/
Learning Graph Structure With A Finite-State Automaton Layerhttps://ml4code.github.io/publications/johnson2020learning/
Predicting Vulnerability in Large Codebases With Deep Code Representationhttps://ml4code.github.io/publications/ashwath2020predicting/
SinkFinder: harvesting hundreds of unknown interesting function pairs with just one seedhttps://ml4code.github.io/publications/bian2020sinkfinder/
Exploration of Convolutional Neural Network models for source code classificationhttps://ml4code.github.io/publications/barchi2021exploration/
Source Code Classification for Energy Efficiency in Parallel Ultra Low-Power Microcontrollershttps://ml4code.github.io/publications/parisi2021source/
Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mappinghttps://ml4code.github.io/publications/parisi2022making/
What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Codehttps://ml4code.github.io/publications/wan2022what/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
(Partial) Program Dependence Learninghttps://ml4code.github.io/publications/yadavally2023partial/
A Learning-Based Approach to Static Program Slicinghttps://ml4code.github.io/publications/yadavally2024learning/
Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learninghttps://ml4code.github.io/publications/yadavally2024predictive/
Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippetshttps://ml4code.github.io/publications/lomshakov2023fine/
Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippetshttps://ml4code.github.io/publications/lomshakov2023fine/
Testing Neural Program Analyzershttps://ml4code.github.io/publications/rabin2019testing/
Mercem: Method Name Recommendation Based on Call Graph Embeddinghttps://ml4code.github.io/publications/yonai2019mercem/
On the Impact of Refactoring Operations on Code Naturalnesshttps://ml4code.github.io/publications/lin2019impact/
Recommendation of Move Method Refactoring Using Path-Based Representation of Codehttps://ml4code.github.io/publications/kurbatova2020recommendation/
Understanding Neural Code Intelligence Through Program Simplificationhttps://ml4code.github.io/publications/rabin2021understanding/
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformationshttps://ml4code.github.io/publications/rabin2021generalizability/
Mining Idioms in the Wildhttps://ml4code.github.io/publications/sivaraman2021mining/
Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022understanding/
Memorization and Generalization in Neural Code Intelligence Modelshttps://ml4code.github.io/publications/rabin2022memorization/
Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Modelshttps://ml4code.github.io/publications/campbell2014syntax/
Learning Program Embeddings to Propagate Feedback on Student Codehttps://ml4code.github.io/publications/piech2015learning/
OverCode: visualizing variation in student solutions to programming problems at scalehttps://ml4code.github.io/publications/glassman2015overcode/
Automated Correction for Syntax Errors in Programming Assignments using Recurrent Neural Networkshttps://ml4code.github.io/publications/bhatia2016automated/
sk_p: a neural program corrector for MOOCshttps://ml4code.github.io/publications/pu2016skp/
Semantic Code Repair using Neuro-Symbolic Transformation Networkshttps://ml4code.github.io/publications/devlin2017semantic/
DeepFix: Fixing Common C Language Errors by Deep Learninghttps://ml4code.github.io/publications/gupta2017deepfix/
Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similaritieshttps://ml4code.github.io/publications/white2017sorting/
An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translationhttps://ml4code.github.io/publications/tufano2018empirical/
Deep Reinforcement Learning for Programming Language Correctionhttps://ml4code.github.io/publications/gupta2018deep/
Learning How to Mutate Source Code from Bug-Fixeshttps://ml4code.github.io/publications/tufano2018learning/
CODIT: Code Editing with Tree-Based Neural Machine Translationhttps://ml4code.github.io/publications/chakraborty2018tree2tree/
Learning to Generate Corrective Patches using Neural Machine Translationhttps://ml4code.github.io/publications/hata2018learning/
Learning to Repair Software Vulnerabilities with Generative Adversarial Networkshttps://ml4code.github.io/publications/harer2018learning/
Neuro-symbolic program corrector for introductory programming assignmentshttps://ml4code.github.io/publications/bhatia2018neurosymbolic/
Syntax and Sensibility: Using language models to detect and correct syntax errorshttps://ml4code.github.io/publications/santos2018syntax/
SampleFix: Learning to Correct Programs by Sampling Diverse Fixeshttps://ml4code.github.io/publications/hajipour2019samplefix/
On Learning Meaningful Code Changes via Neural Machine Translationhttps://ml4code.github.io/publications/tufano2019learning/
SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repairhttps://ml4code.github.io/publications/chen2019sequencer/
Learning to Fix Build Errors with Graph2Diff Neural Networkshttps://ml4code.github.io/publications/tarlow2019learning/
DeepDelta: Learning to Repair Compilation Errorshttps://ml4code.github.io/publications/mesbah2019deepdelta/
Neural Program Repair by Jointly Learning to Localize and Repairhttps://ml4code.github.io/publications/vasic2019neural/
Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repairhttps://ml4code.github.io/publications/tian2020evaluating/
DLFix: Context-based Code Transformation Learning for Automated Program Repairhttps://ml4code.github.io/publications/li2020dlfix/
Hoppity: Learning Bug Detection and Repairhttps://ml4code.github.io/publications/dinella2020hoppity/
Graph-based, Self-Supervised Program Repair from Diagnostic Feedbackhttps://ml4code.github.io/publications/yasunaga2020graph/
Self-Supervised Bug Detection and Repairhttps://ml4code.github.io/publications/allamanis2021self/
Learning to Find Naming Issues with Big Code and Small Supervisionhttps://ml4code.github.io/publications/he2021learning/
Semantic Bug Seeding: A Learning-Based Approach for Creating Realistic Bugshttps://ml4code.github.io/publications/patra2021semantic/
Neural Program Repair with Execution-based Backpropagationhttps://ml4code.github.io/publications/ye2021neural/
Fix-Filter-Fix: Intuitively Connect Any Models for Effective Bug Fixinghttps://ml4code.github.io/publications/hong2021fix/
DeepMerge: Learning to Merge Programshttps://ml4code.github.io/publications/dinella2021deepmerge/
Learning to Extend Program Graphs to Work-in-Progress Codehttps://ml4code.github.io/publications/li2021learning/
DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletonshttps://ml4code.github.io/publications/drain2021deepdebug/
Generating Bug-Fixes Using Pretrained Transformershttps://ml4code.github.io/publications/drain2021generating/
TFix: Learning to Fix Coding Errors with a Text-to-Text Transformerhttps://ml4code.github.io/publications/berabi2021tfix/
PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repairhttps://ml4code.github.io/publications/chen2021plur/
SelfAPR: Self-supervised Program Repair with Test Execution Diagnosticshttps://ml4code.github.io/publications/ye2022selfapr/
Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixeshttps://ml4code.github.io/publications/richter2022can/
Using Developer Discussions to Guide Fixing Bugs in Softwarehttps://ml4code.github.io/publications/panthaplackel2022using/
Demystifying GPT Self-Repair for Code Generationhttps://ml4code.github.io/publications/olausson2023demystifying/
TraceFixer: Execution Trace-Driven Program Repairhttps://ml4code.github.io/publications/bouzenia2023tracefixer/
Model-Agnostic Syntactical Information for Pre-Trained Programming Language Modelshttps://ml4code.github.io/publications/saberi2023model/
SkipAnalyzer: A Tool for Static Code Analysis with Large Language Modelshttps://ml4code.github.io/publications/mohajer2023skipanalyzer/
RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repairhttps://ml4code.github.io/publications/silva2023repairllama/
DebugBench: Evaluating Debugging Capability of Large Language Modelshttps://ml4code.github.io/publications/tian2024debugbench/
T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemblehttps://ml4code.github.io/publications/gharibi2024t5apr/
RepairAgent: An Autonomous, LLM-Based Agent for Program Repairhttps://ml4code.github.io/publications/bouzenia2024repairagent/
DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Modelshttps://ml4code.github.io/publications/berabi2024deepcode/
Building Program Vector Representations for Deep Learninghttps://ml4code.github.io/publications/mou2014building/
Learning to Executehttps://ml4code.github.io/publications/zaremba2014learning/
Exploring the Use of Deep Learning for Feature Locationhttps://ml4code.github.io/publications/corley2015exploring/
Learning Program Embeddings to Propagate Feedback on Student Codehttps://ml4code.github.io/publications/piech2015learning/
Toward Deep Learning Software Repositorieshttps://ml4code.github.io/publications/white2015toward/
Graph-based Statistical Language Model for Codehttps://ml4code.github.io/publications/nguyen2015graph/
Learning to Generate Pseudo-code from Source Code using Statistical Machine Translationhttps://ml4code.github.io/publications/oda2015learning/
Learning API Usages from Bytecode: A Statistical Approachhttps://ml4code.github.io/publications/nguyen2016learning/
Convolutional Neural Networks over Tree Structures for Programming Language Processinghttps://ml4code.github.io/publications/mou2016convolutional/
Bugram: bug detection with n-gram language modelshttps://ml4code.github.io/publications/wang2016bugram/
Automatically Learning Semantic Features for Defect Predictionhttps://ml4code.github.io/publications/wang2016automatically/
Automatically generating features for learning program analysis heuristicshttps://ml4code.github.io/publications/chae2016automatically/
Semantically enhanced software traceability using deep learning techniqueshttps://ml4code.github.io/publications/guo2017semantically/
SmartPaste: Learning to Adapt Source Codehttps://ml4code.github.io/publications/allamanis2017smartpaste/
Neural Attribute Machines for Program Generationhttps://ml4code.github.io/publications/amodio2017neural/
Exploring API Embedding for API Usages and Applicationshttps://ml4code.github.io/publications/nguyen2017exploring/
Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Codehttps://ml4code.github.io/publications/bui2018hierarchical/
Bilateral Dependency Neural Networks for Cross-Language Algorithm Classificationhttps://ml4code.github.io/publications/bui2018bilateral/
Path-Based Function Embedding and its Application to Specification Mininghttps://ml4code.github.io/publications/defreez2018path/
Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networkshttps://ml4code.github.io/publications/bui2018cross/
Open Vocabulary Learning on Source Code with a Graph-Structured Cachehttps://ml4code.github.io/publications/cvitkovic2018open/
Learning to Represent Programs with Graphshttps://ml4code.github.io/publications/allamanis2018learning/
Deep Learning Similarities from Different Representations of Source Codehttps://ml4code.github.io/publications/tufano2018deep/
Neural Code Comprehension: A Learnable Representation of Code Semanticshttps://ml4code.github.io/publications/bennun2018neural/
Intelligent code reviews using deep learninghttps://ml4code.github.io/publications/gupta2018intelligent/
A General Path-Based Representation for Predicting Program Propertieshttps://ml4code.github.io/publications/alon2018general/
Deep Learning Type Inferencehttps://ml4code.github.io/publications/hellendoorn2018deep/
code2vec: Learning Distributed Representations of Codehttps://ml4code.github.io/publications/alon2019code2vec/
On the Feasibility of Transfer-learning Code Smells using Deep Learninghttps://ml4code.github.io/publications/sharma2019feasibility/
Mercem: Method Name Recommendation Based on Call Graph Embeddinghttps://ml4code.github.io/publications/yonai2019mercem/
Learning Execution through Neural Code Fusionhttps://ml4code.github.io/publications/shi2019learning/
Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networkshttps://ml4code.github.io/publications/li2019improving/
Capturing source code semantics via tree-based convolution over API-enhanced ASThttps://ml4code.github.io/publications/chen2019capturing/
Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentiallyhttps://ml4code.github.io/publications/zhang2019learning/
A Literature Study of Embeddings on Source Codehttps://ml4code.github.io/publications/chen2019literature/
code2seq: Generating Sequences from Structured Representations of Codehttps://ml4code.github.io/publications/alon2018code2seq/
SAR: Learning Cross-Language API Mappings with Little Knowledgehttps://ml4code.github.io/publications/bui2019learning/
Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embeddinghttps://ml4code.github.io/publications/chen2019mining/
PathMiner: A Library for Mining of Path-Based Representations of Codehttps://ml4code.github.io/publications/kovalenko2019pathminer/
Import2vec - Learning Embeddings for Software Librarieshttps://ml4code.github.io/publications/theeten2019import2vec/
Semantic Source Code Models Using Identifier Embeddingshttps://ml4code.github.io/publications/efstathiou2019semantic/
Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimizationhttps://ml4code.github.io/publications/ding2019asm2vec/
Learning Scalable and Precise Representation of Program Semanticshttps://ml4code.github.io/publications/wang2019learning/
Program Classification Using Gated Graph Attention Neural Network for Online Programming Servicehttps://ml4code.github.io/publications/lu2019program/
Neural Attribution for Semantic Bug-Localization in Student Programshttps://ml4code.github.io/publications/gupta2019neural/
TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processinghttps://ml4code.github.io/publications/jayasundara2019treecaps/
A Novel Neural Source Code Representation based on Abstract Syntax Treehttps://ml4code.github.io/publications/zhang2019novel/
Modular Tree Network for Source Code Representation Learninghttps://ml4code.github.io/publications/wang2020modular/
Searching a Database of Source Codes Using Contextualized Code Searchhttps://ml4code.github.io/publications/mukherjee2020searching/
Towards Demystifying Dimensions of Source Code Embeddingshttps://ml4code.github.io/publications/rabin2020demystifying/
Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networkshttps://ml4code.github.io/publications/bieber2020learning/
Towards Learning Representations of Binary Executable Files for Security Taskshttps://ml4code.github.io/publications/arakelyan2020towards/
ComPy-Learn: A toolbox for exploring machine learning representations for compilershttps://ml4code.github.io/publications/brauckmann2020compy/
Compiler-based graph representations for deep learning models of codehttps://ml4code.github.io/publications/brauckmann2020compiler/
Contrastive Code Representation Learninghttps://ml4code.github.io/publications/jain2020contrastive/
Unsupervised Learning of General-Purpose Embeddings for Code Changeshttps://ml4code.github.io/publications/pravilov2021unsupervised/
Contrastive Learning for Source Code with Structural and Functional Propertieshttps://ml4code.github.io/publications/ding2021contrastive/
Disentangled Code Representation Learning for Multiple Programming Languageshttps://ml4code.github.io/publications/zhang2021disentangled/
IdBench: Evaluating Semantic Representations of Identifier Names in Source Codehttps://ml4code.github.io/publications/waunakh2019idbench/
Multimodal Representation for Neural Code Searchhttps://ml4code.github.io/publications/jian2021multimodal/
MulCode: A Multi-task Learning Approach for Source Code Understandinghttps://ml4code.github.io/publications/deze2021mulcode/
Language-Agnostic Representation Learning of Source Code from Structure and Contexthttps://ml4code.github.io/publications/zugner2021language/
InferCode: Self-Supervised Learning of Code Representations by Predicting Subtreeshttps://ml4code.github.io/publications/bui2021infercode/
Learning Program Semantics with Code Representations: An Empirical Studyhttps://ml4code.github.io/publications/siow2022learning/
Bridging Pre-trained Models and Downstream Tasks for Source Code Understandinghttps://ml4code.github.io/publications/deze2022bridging/
SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representationshttps://ml4code.github.io/publications/niu2022spt-code/
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognitionhttps://ml4code.github.io/publications/sharma2022lamner/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
CodeTrek: Flexible Modeling of Code using an Extensible Relational Representationhttps://ml4code.github.io/publications/pashakhanloo2022codetrek/
Topical: Learning Repository Embeddings from Source Code using Attentionhttps://ml4code.github.io/publications/lherondelle2022topical/
Rethinking Negative Pairs in Code Searchhttps://ml4code.github.io/publications/li2023rethinking/
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generationhttps://ml4code.github.io/publications/zhang2023repocoder/
Learning to Reverse DNNs from AI Programs Automaticallyhttps://ml4code.github.io/publications/chen2022learning/
Will they like this? Evaluating Code Contributions With Language Modelshttps://ml4code.github.io/publications/hellendoorn2015will/
Intelligent code reviews using deep learninghttps://ml4code.github.io/publications/gupta2018intelligent/
CORE: Automating Review Recommendation for Code Changeshttps://ml4code.github.io/publications/siow2019core/
Deep Learning Approaches to Source Code Analysis for Optimization of Heterogeneous Systems: Recent Results, Challenges and Opportunitieshttps://ml4code.github.io/publications/barchi2022deep/
CodeReviewer: Pre-Training for Automating Code Review Activitieshttps://ml4code.github.io/publications/li2022codereviewer/
What is it like to program with artificial intelligence?https://ml4code.github.io/publications/sarkar2022what/
Aroma: code recommendation via structural code searchhttps://ml4code.github.io/publications/luan2019aroma/
A Bimodal Modelling of Source Code and Natural Languagehttps://ml4code.github.io/publications/allamanis2015bimodal/
Deep API Learninghttps://ml4code.github.io/publications/gu2016deep/
Deep Code Searchhttps://ml4code.github.io/publications/gu2018deep/
A Retrieve-and-Edit Framework for Predicting Structured Outputshttps://ml4code.github.io/publications/hashimoto2018retrieve/
CodeSearchNet Challenge: Evaluating the State of Semantic Code Searchhttps://ml4code.github.io/publications/husain2019codesearchnet/
Neural Code Search Evaluation Datasethttps://ml4code.github.io/publications/li2019neural/
Multi-Modal Attention Network Learning for Semantic Source Code Retrievalhttps://ml4code.github.io/publications/wan2019multimodal/
CoaCor: Code Annotation for Code Retrieval with Reinforcement Learninghttps://ml4code.github.io/publications/yao2019coacor/
When Deep Learning Met Code Searchhttps://ml4code.github.io/publications/cambronero2019deep/
Neural query expansion for code searchhttps://ml4code.github.io/publications/liu2019neural/
Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intenthttps://ml4code.github.io/publications/heyman2020neural/
CoNCRA: A Convolutional Neural Network Code Retrieval Approachhttps://ml4code.github.io/publications/derezendemartins2020concra/
Searching a Database of Source Codes Using Contextualized Code Searchhttps://ml4code.github.io/publications/mukherjee2020searching/
Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learninghttps://ml4code.github.io/publications/ye2020leveraging/
TranS^3: A Transformer-based Framework for Unifying Code Summarization and Code Searchhttps://ml4code.github.io/publications/wang2020trans/
PSCS: A Path-based Neural Model for Semantic Code Searchhttps://ml4code.github.io/publications/sun2020pscs/
Improving Code Search with Co-Attentive Representation Learninghttps://ml4code.github.io/publications/shuai2020improving/
A Multi-Perspective Architecture for Semantic Code Searchhttps://ml4code.github.io/publications/haldar2020multiperspective/
Adaptive Deep Code Searchhttps://ml4code.github.io/publications/ling2020adaptive/
Are the Code Snippets What We Are Searching for? A Benchmark and an Empirical Study on Code Search with Natural-Language Querieshttps://ml4code.github.io/publications/yan2020are/
NaturalCC: A Toolkit to Naturalize the Source Code Corpushttps://ml4code.github.io/publications/wan2020naturalcc/
Deep Graph Matching and Searching for Semantic Code Retrievalhttps://ml4code.github.io/publications/ling2020deep/
Learning Code-Query Interaction for Enhancing Code Searcheshttps://ml4code.github.io/publications/li2020learning/
OCoR: An Overlapping-Aware Code Retrieverhttps://ml4code.github.io/publications/zhu2020ocor/
CoSQA: 20,000+ Web Queries for Code Search and Question Answeringhttps://ml4code.github.io/publications/huang2021cosqa/
DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learninghttps://ml4code.github.io/publications/ellis2021dreamcoder/
Multimodal Representation for Neural Code Searchhttps://ml4code.github.io/publications/jian2021multimodal/
Bag-of-Words Baselines for Semantic Code Searchhttps://ml4code.github.io/publications/zhang2021bag/
Distilling Transformers for Neural Cross-Domain Searchhttps://ml4code.github.io/publications/clement2021distilling/
Leveraging Language to Learn Program Abstractions and Search Heuristicshttps://ml4code.github.io/publications/wong2021leveraging/
Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformationshttps://ml4code.github.io/publications/bui2021efficient/
Exploring Representation-Level Augmentation for Code Searchhttps://ml4code.github.io/publications/li2022exploring/
Senatus - A Fast and Accurate Code-to-Code Recommendation Enginehttps://ml4code.github.io/publications/silavong2022senatus/
DocCoder: Generating Code by Retrieving and Reading Docshttps://ml4code.github.io/publications/zhou2022docoder/
CodeDSI: Differentiable Code Searchhttps://ml4code.github.io/publications/nadeem2022codedsi/
Rethinking Negative Pairs in Code Searchhttps://ml4code.github.io/publications/li2023rethinking/
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Searchhttps://ml4code.github.io/publications/li2024rewriting/
A Learning-Based Approach to Static Program Slicinghttps://ml4code.github.io/publications/yadavally2024learning/
Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Toolshttps://ml4code.github.io/publications/koc2017learning/
Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IRhttps://ml4code.github.io/publications/barchi2019code/
Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networkshttps://ml4code.github.io/publications/zhou2019devign/
Predicting Vulnerability in Large Codebases With Deep Code Representationhttps://ml4code.github.io/publications/ashwath2020predicting/
Exploration of Convolutional Neural Network models for source code classificationhttps://ml4code.github.io/publications/barchi2021exploration/
Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mappinghttps://ml4code.github.io/publications/parisi2022making/
Learning to Answer Semantic Queries over Codehttps://ml4code.github.io/publications/sahu2022learning/
Learning to Reduce False Positives in Analytic Bug Detectorshttps://ml4code.github.io/publications/kharkar2022learning/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Modelshttps://ml4code.github.io/publications/li2023hitchhiker/
A Static Evaluation of Code Completion by Large Language Modelshttps://ml4code.github.io/publications/ding2023static/
(Partial) Program Dependence Learninghttps://ml4code.github.io/publications/yadavally2023partial/
Beware of the Unexpected: Bimodal Taint Analysishttps://ml4code.github.io/publications/chow2023beware/
Learning Natural Coding Conventionshttps://ml4code.github.io/publications/allamanis2014learning/
STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithmshttps://ml4code.github.io/publications/markovtsev2019style/
Natural Language Models for Predicting Programming Commentshttps://ml4code.github.io/publications/movshovitz2013natural/
A Convolutional Attention Network for Extreme Summarization of Source Codehttps://ml4code.github.io/publications/allamanis2016convolutional/
Summarizing Source Code using a Neural Attention Modelhttps://ml4code.github.io/publications/iyer2016summarizing/
Autofolding for Source Code Summarizationhttps://ml4code.github.io/publications/fowkes2017autofolding/
Abridging Source Codehttps://ml4code.github.io/publications/yuan2017abridging/
CodeSum: Translate Program Language to Natural Languagehttps://ml4code.github.io/publications/hu2017codesum/
A parallel corpus of Python functions and documentation strings for automated code documentation and code generationhttps://ml4code.github.io/publications/barone2017parallel/
A Neural Architecture for Generating Natural Language Descriptions from Source Code Changeshttps://ml4code.github.io/publications/loyola2017neural/
Content Aware Source Code Change Description Generationhttps://ml4code.github.io/publications/loyola2018content/
Improving Automatic Source Code Summarization via Deep Reinforcement Learninghttps://ml4code.github.io/publications/wan2018improving/
Neural-Machine-Translation-Based Commit Message Generation: How Far Are We?https://ml4code.github.io/publications/liu2018neural/
code2vec: Learning Distributed Representations of Codehttps://ml4code.github.io/publications/alon2019code2vec/
A Neural Model for Method Name Generation from Functional Descriptionhttps://ml4code.github.io/publications/gao2019neural/
code2seq: Generating Sequences from Structured Representations of Codehttps://ml4code.github.io/publications/alon2018code2seq/
Commit Message Generation for Source Code Changeshttps://ml4code.github.io/publications/xu2019commit/
Code Generation as a Dual Task of Code Summarizationhttps://ml4code.github.io/publications/wei2019code/
Structured Neural Summarizationhttps://ml4code.github.io/publications/fernandes2019structured/
A Neural Model for Generating Natural Language Summaries of Program Subroutineshttps://ml4code.github.io/publications/leclair2019neural/
Recommendations for Datasets for Source Code Summarizationhttps://ml4code.github.io/publications/leclair2019recommendations/
Automatic Source Code Summarization with Extended Tree-LSTMhttps://ml4code.github.io/publications/shido2019automatic/
Improved Automatic Summarization of Subroutines via Attention to File Contexthttps://ml4code.github.io/publications/haque2020improved/
Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learninghttps://ml4code.github.io/publications/ye2020leveraging/
A Transformer-based Approach for Source Code Summarizationhttps://ml4code.github.io/publications/ahmad2020transformer/
PyMT5: multi-mode translation of natural language and Python code with transformershttps://ml4code.github.io/publications/clement2020pymt5/
NaturalCC: A Toolkit to Naturalize the Source Code Corpushttps://ml4code.github.io/publications/wan2020naturalcc/
Improved Code Summarization via a Graph Neural Networkhttps://ml4code.github.io/publications/leclair2020improved/
CoCoGUM: Contextual Code Summarization with Multi-Relational GNN on UMLshttps://ml4code.github.io/publications/wang2020cocogum/
Learning to Represent Programs with Heterogeneous Graphshttps://ml4code.github.io/publications/wang2020learning2/
On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformationshttps://ml4code.github.io/publications/rabin2021generalizability/
Retrieval Augmented Code Generation and Summarizationhttps://ml4code.github.io/publications/parvez2021retrieval/
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errorshttps://ml4code.github.io/publications/mahmud2021code/
Learning to Describe Solutions for Bug Reports Based on Developer Discussionshttps://ml4code.github.io/publications/panthaplackel2021learning/
Assemble Foundation Models for Automatic Code Summarizationhttps://ml4code.github.io/publications/jian2022assemble/
InCoder: A Generative Model for Code Infilling and Synthesishttps://ml4code.github.io/publications/fried2022incoder/
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognitionhttps://ml4code.github.io/publications/sharma2022lamner/
Learning code summarization from a small and local datasethttps://ml4code.github.io/publications/ahmed2022learning/
Improving Few-Shot Prompts with Relevant Static Analysis Productshttps://ml4code.github.io/publications/ahmed2033improving/
Model-Agnostic Syntactical Information for Pre-Trained Programming Language Modelshttps://ml4code.github.io/publications/saberi2023model/
A Survey on Deep Learning for Software Engineeringhttps://ml4code.github.io/publications/yang2020survey/
Neural Software Analysishttps://ml4code.github.io/publications/pradel2020neural/
Deep Learning & Software Engineering: State of Research and Future Directionshttps://ml4code.github.io/publications/devanbu2020deep/
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errorshttps://ml4code.github.io/publications/mahmud2021code/
A Systematic Literature Review on the Use of Deep Learning in Software Engineering Researchhttps://ml4code.github.io/publications/watson2021systematic/
Deep Learning based Vulnerability Detection: Are We There Yet?https://ml4code.github.io/publications/chakraborty2020deep/
A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Taskshttps://ml4code.github.io/publications/casey2024survey/
NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulationhttps://ml4code.github.io/publications/gulwani2014nlyze/
Synthesizing Java expressions from free-form querieshttps://ml4code.github.io/publications/gvero2015synthesizing/
SPoC: Search-based Pseudocode to Codehttps://ml4code.github.io/publications/kulal2019spoc/
AutoPandas: neural-backed generators for program synthesishttps://ml4code.github.io/publications/bavishi2019autopandas/
Semantic Scaffolds for Pseudocode-to-Code Generationhttps://ml4code.github.io/publications/zhong2020semantic/
Unit Test Case Generation with Transformershttps://ml4code.github.io/publications/tufano2020unit/
Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformershttps://ml4code.github.io/publications/tufano2020generating/
IntelliCode Compose: Code Generation Using Transformerhttps://ml4code.github.io/publications/svyatkovskiy2020intellicode/
Evaluating Large Language Models Trained on Codehttps://ml4code.github.io/publications/chen2021evaluating/
DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learninghttps://ml4code.github.io/publications/ellis2021dreamcoder/
Program Synthesis with Large Language Modelshttps://ml4code.github.io/publications/nye2021program/
A large-scale benchmark for few-shot program induction and synthesishttps://ml4code.github.io/publications/alet2021largescale/
Leveraging Language to Learn Program Abstractions and Search Heuristicshttps://ml4code.github.io/publications/wong2021leveraging/
Neural Program Generation Modulo Static Analysishttps://ml4code.github.io/publications/mukherjee2021neural/
A Conversational Paradigm for Program Synthesishttps://ml4code.github.io/publications/nijkamp2022conversational/
I Speak, You Verify: Toward Trustworthy Neural Program Synthesishttps://ml4code.github.io/publications/key2022speak/
CodeT: Code Generation with Generated Testshttps://ml4code.github.io/publications/chen2022codet/
Grounded Copilot: How Programmers Interact with Code-Generating Modelshttps://ml4code.github.io/publications/barke2022grounded/
Unit Test Case Generation with Transformershttps://ml4code.github.io/publications/tufano2020unit/
Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformershttps://ml4code.github.io/publications/tufano2020generating/
TOGA: A Neural Method for Test Oracle Generationhttps://ml4code.github.io/publications/dinella2022toga/
Test-based and metric-based evaluation of code generation models for practical question answeringhttps://ml4code.github.io/publications/kovalchuk2023test/
PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Codehttps://ml4code.github.io/publications/spirin2021psiminer/
Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Contexthttps://ml4code.github.io/publications/agrawal2023monitor/
(Partial) Program Dependence Learninghttps://ml4code.github.io/publications/yadavally2023partial/
A Learning-Based Approach to Static Program Slicinghttps://ml4code.github.io/publications/yadavally2024learning/
Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learninghttps://ml4code.github.io/publications/yadavally2024predictive/
Topic modeling of public repositories at scale using names in source codehttps://ml4code.github.io/publications/markovtsev2017topic/
Topical: Learning Repository Embeddings from Source Code using Attentionhttps://ml4code.github.io/publications/lherondelle2022topical/
Semantically enhanced software traceability using deep learning techniqueshttps://ml4code.github.io/publications/guo2017semantically/
Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repairhttps://ml4code.github.io/publications/tian2020evaluating/
Global Relational Models of Source Codehttps://ml4code.github.io/publications/hellendoorn2020global/
Empirical Study of Transformers for Source Codehttps://ml4code.github.io/publications/chirkova2020empirical/
Self-Supervised Bug Detection and Repairhttps://ml4code.github.io/publications/allamanis2021self/
Retrieval Augmented Code Generation and Summarizationhttps://ml4code.github.io/publications/parvez2021retrieval/
Show Your Work: Scratchpads for Intermediate Computation with Language Modelshttps://ml4code.github.io/publications/nye2021show/
ProtoTransformer: A Meta-Learning Approach to Providing Student Feedbackhttps://ml4code.github.io/publications/wu2021prototransformer/
CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Modelhttps://ml4code.github.io/publications/jung2021commitbert/
CoTexT: Multi-task Learning with Code-Text Transformerhttps://ml4code.github.io/publications/phan2021cotext/
Code to Comment Translation: A Comparative Study on Model Effectiveness & Errorshttps://ml4code.github.io/publications/mahmud2021code/
ConTest: A Unit Test Completion Benchmark featuring Contexthttps://ml4code.github.io/publications/villmow2021contest/
Contrastive Learning for Source Code with Structural and Functional Propertieshttps://ml4code.github.io/publications/ding2021contrastive/
Jointly Learning to Repair Code and Generate Commit Messagehttps://ml4code.github.io/publications/bai2021jointly/
Co-Training for Commit Classificationhttps://ml4code.github.io/publications/lee2021cotraining/
Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Modelshttps://ml4code.github.io/publications/li2021toward/
Learning Type Annotation: Is Big Data Enough?https://ml4code.github.io/publications/jesse2021learning/
TreeBERT: A Tree-Based Pre-Trained Model for Programming Languagehttps://ml4code.github.io/publications/jiang2021treebert/
Program Synthesis with Large Language Modelshttps://ml4code.github.io/publications/nye2021program/
An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributionshttps://ml4code.github.io/publications/pearce2021empirical/
Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generationhttps://ml4code.github.io/publications/orlanski2021reading/
How could Neural Networks understand Programs?https://ml4code.github.io/publications/peng2021how/
Learning to Extend Program Graphs to Work-in-Progress Codehttps://ml4code.github.io/publications/li2021learning/
DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletonshttps://ml4code.github.io/publications/drain2021deepdebug/
Generating Bug-Fixes Using Pretrained Transformershttps://ml4code.github.io/publications/drain2021generating/
CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generationhttps://ml4code.github.io/publications/wang2021codet5/
Improving Code Autocompletion with Transfer Learninghttps://ml4code.github.io/publications/zhou2021improving/
Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchyhttps://ml4code.github.io/publications/clement2021long/
Language-Agnostic Representation Learning of Source Code from Structure and Contexthttps://ml4code.github.io/publications/zugner2021language/
Distilling Transformers for Neural Cross-Domain Searchhttps://ml4code.github.io/publications/clement2021distilling/
Time-Efficient Code Completion Model for the R Programming Languagehttps://ml4code.github.io/publications/popov2021time/
What do pre-trained code models know about code?https://ml4code.github.io/publications/karmakar2021what/
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computinghttps://ml4code.github.io/publications/elnaggar2021codetrans/
DIRECT: A Transformer-based Model for Decompiled Identifier Renaminghttps://ml4code.github.io/publications/nitin2021direct/
On Multi-Modal Learning of Editing Source Codehttps://ml4code.github.io/publications/chakraborty2021multimodal/
CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generationhttps://ml4code.github.io/publications/lu2021codexglue/
Unified Pre-training for Program Understanding and Generationhttps://ml4code.github.io/publications/ahmad2021unified/
Code Translation with Compiler Representationshttps://ml4code.github.io/publications/szafraniec2022code/
Learning to Model Editing Processeshttps://ml4code.github.io/publications/reid2022learning/
SantaCoder: don’t reach for the stars!https://ml4code.github.io/publications/allal2022santacoder/
Learning To Predict User-Defined Typeshttps://ml4code.github.io/publications/jesse2022learning/
Efficient Training of Language Models to Fill in the Middlehttps://ml4code.github.io/publications/bavarian2022efficient/
CoditT5: Pretraining for Source Code and Natural Language Editinghttps://ml4code.github.io/publications/zhang2022coditt5/
Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic?https://ml4code.github.io/publications/doderlein2022piloting/
Exploring Representation-Level Augmentation for Code Searchhttps://ml4code.github.io/publications/li2022exploring/
A Systematic Evaluation of Large Language Models of Codehttps://ml4code.github.io/publications/xu2022systematic/
A Conversational Paradigm for Program Synthesishttps://ml4code.github.io/publications/nijkamp2022conversational/
Synchromesh: Reliable code generation from pre-trained language modelshttps://ml4code.github.io/publications/poesia2022synchromesh/
An Extensive Study on Pre-trained Models for Program Understanding and Generationhttps://ml4code.github.io/publications/zeng2022extensive/
TOGA: A Neural Method for Test Oracle Generationhttps://ml4code.github.io/publications/dinella2022toga/
Learning to Complete Code with Sketcheshttps://ml4code.github.io/publications/guo2022learning/
UniXcoder: Unified Cross-Modal Pre-training for Code Representationhttps://ml4code.github.io/publications/guo2022unixcoder/
Repository-Level Prompt Generation for Large Language Models of Codehttps://ml4code.github.io/publications/shrivastava2020repository/
InCoder: A Generative Model for Code Infilling and Synthesishttps://ml4code.github.io/publications/fried2022incoder/
Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixeshttps://ml4code.github.io/publications/richter2022can/
CodeT: Code Generation with Generated Testshttps://ml4code.github.io/publications/chen2022codet/
SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representationshttps://ml4code.github.io/publications/niu2022spt-code/
DocCoder: Generating Code by Retrieving and Reading Docshttps://ml4code.github.io/publications/zhou2022docoder/
ReACC: A Retrieval-Augmented Code Completion Frameworkhttps://ml4code.github.io/publications/lu2022reacc/
Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysishttps://ml4code.github.io/publications/naik2022probing/
Using Developer Discussions to Guide Fixing Bugs in Softwarehttps://ml4code.github.io/publications/panthaplackel2022using/
CV4Code: Sourcecode Understanding via Visual Code Representationshttps://ml4code.github.io/publications/shi2022cv4code/
An Exploratory Study on Code Attention in BERThttps://ml4code.github.io/publications/sharma2022exploratory/
Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsinghttps://ml4code.github.io/publications/patil2022exploring/
Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Codehttps://ml4code.github.io/publications/bareiss2022code/
Learning code summarization from a small and local datasethttps://ml4code.github.io/publications/ahmed2022learning/
Learning to Answer Semantic Queries over Codehttps://ml4code.github.io/publications/sahu2022learning/
DeepPERF: A Deep Learning-Based Approach For Improving Software Performancehttps://ml4code.github.io/publications/garg2022deepperf/
Using Deep Learning to Generate Complete Log Statementshttps://ml4code.github.io/publications/mastropaolo2022using/
Learning to Reduce False Positives in Analytic Bug Detectorshttps://ml4code.github.io/publications/kharkar2022learning/
Exploring and Evaluating Personalized Models for Code Generationhttps://ml4code.github.io/publications/zlotchevski2022exploring/
What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Codehttps://ml4code.github.io/publications/wan2022what/
Improving Few-Shot Prompts with Relevant Static Analysis Productshttps://ml4code.github.io/publications/ahmed2033improving/
CodeBERTScore: Evaluating Code Generation with Pretrained Models of Codehttps://ml4code.github.io/publications/zhou2022codebertscore/
StarCoder: may the source be with you!https://ml4code.github.io/publications/li2023starcoder/
Large Language Models and Simple, Stupid Bugshttps://ml4code.github.io/publications/jesse2023large/
TypeT5: Seq2seq Type Inference using Static Analysishttps://ml4code.github.io/publications/wei2023typet5/
TraceFixer: Execution Trace-Driven Program Repairhttps://ml4code.github.io/publications/bouzenia2023tracefixer/
Model-Agnostic Syntactical Information for Pre-Trained Programming Language Modelshttps://ml4code.github.io/publications/saberi2023model/
Rethinking Negative Pairs in Code Searchhttps://ml4code.github.io/publications/li2023rethinking/
CodeGen2: Lessons for Training LLMs on Programming and Natural Languageshttps://ml4code.github.io/publications/nijkamp2023codegen2/
RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generationhttps://ml4code.github.io/publications/zhang2023repocoder/
Code Execution with Pre-trained Language Modelshttps://ml4code.github.io/publications/liu2023code/
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detectionhttps://ml4code.github.io/publications/chen2023diversevul/
CodeScore: Evaluating Code Generation by Learning Code Executionhttps://ml4code.github.io/publications/dong2023codescore/
CodeT5+: Open Code Large Language Models for Code Understanding and Generationhttps://ml4code.github.io/publications/wang2023codet5/
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generationhttps://ml4code.github.io/publications/li2023think/
T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemblehttps://ml4code.github.io/publications/gharibi2024t5apr/
Studying LLM Performance on Closed- and Open-source Datahttps://ml4code.github.io/publications/ahmed2024studying/
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligencehttps://ml4code.github.io/publications/guo2024deepseek/
Automatically Testing Functional Properties of Code Translation Modelshttps://ml4code.github.io/publications/eniser2023automatically/
LLM4Decompile: Decompiling Binary Code with Large Language Modelshttps://ml4code.github.io/publications/tan2024llm4decompile/
Predicting Program Properties from “Big Code”https://ml4code.github.io/publications/raychev2015predicting/
RefiNym: Using Names to Refine Typeshttps://ml4code.github.io/publications/dash2018refinym/
Deep Learning Type Inferencehttps://ml4code.github.io/publications/hellendoorn2018deep/
TypeWriter: Neural Type Prediction with Search-based Validationhttps://ml4code.github.io/publications/pradel2019typewriter/
NL2Type: Inferring JavaScript Function Types from Natural Language Informationhttps://ml4code.github.io/publications/malik2019nl2type/
Inferring Javascript types using Graph Neural Networkshttps://ml4code.github.io/publications/schrouff2019inferring/
Learning Lenient Parsing & Typing via Indirect Supervisionhttps://ml4code.github.io/publications/ahmed2019learning/
OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraintshttps://ml4code.github.io/publications/pandi2020opttyper/
LambdaNet: Probabilistic Type Inference using Graph Neural Networkshttps://ml4code.github.io/publications/wei2020lambdanet/
Adversarial Robustness for Codehttps://ml4code.github.io/publications/bielik2020adversarial/
Typilus: Neural Type Hintshttps://ml4code.github.io/publications/allamanis2020typilus/
Learning Type Annotation: Is Big Data Enough?https://ml4code.github.io/publications/jesse2021learning/
ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inferencehttps://ml4code.github.io/publications/mir2021manytypes4py/
Type4Py: Deep Similarity Learning-Based Type Inference for Pythonhttps://ml4code.github.io/publications/mir2021type4py/
Learning To Predict User-Defined Typeshttps://ml4code.github.io/publications/jesse2022learning/
LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognitionhttps://ml4code.github.io/publications/sharma2022lamner/
TypeT5: Seq2seq Type Inference using Static Analysishttps://ml4code.github.io/publications/wei2023typet5/
Generative Type Inference for Pythonhttps://ml4code.github.io/publications/peng2023generative/
SmartPaste: Learning to Adapt Source Codehttps://ml4code.github.io/publications/allamanis2017smartpaste/
Open Vocabulary Learning on Source Code with a Graph-Structured Cachehttps://ml4code.github.io/publications/cvitkovic2018open/
Learning to Represent Programs with Graphshttps://ml4code.github.io/publications/allamanis2018learning/
Neural Program Repair by Jointly Learning to Localize and Repairhttps://ml4code.github.io/publications/vasic2019neural/
Global Relational Models of Source Codehttps://ml4code.github.io/publications/hellendoorn2020global/
CodeTrek: Flexible Modeling of Code using an Extensible Relational Representationhttps://ml4code.github.io/publications/pashakhanloo2022codetrek/
Learning Loop Invariants for Program Verificationhttps://ml4code.github.io/publications/si2018learning/
ConTest: A Unit Test Completion Benchmark featuring Contexthttps://ml4code.github.io/publications/villmow2021contest/
DeepVD: Toward Class-Separation Features for Neural Network Vulnerability Detectionhttps://ml4code.github.io/publications/wang2023deepvd/
DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detectionhttps://ml4code.github.io/publications/chen2023diversevul/
DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Modelshttps://ml4code.github.io/publications/berabi2024deepcode/
A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Taskshttps://ml4code.github.io/publications/casey2024survey/
