Title: Publications by Tag · Machine Learning for Big Code and Naturalness
Domain: ml4code.github.io
Links:
| Miltos Allamanis | https://miltos.allamanis.com |
| adversarial | https://ml4code.github.io/tags.html#adversarial |
| API | https://ml4code.github.io/tags.html#API |
| autocomplete | https://ml4code.github.io/tags.html#autocomplete |
| benchmark | https://ml4code.github.io/tags.html#benchmark |
| benchmarking | https://ml4code.github.io/tags.html#benchmarking |
| bimodal | https://ml4code.github.io/tags.html#bimodal |
| Binary Code | https://ml4code.github.io/tags.html#Binary Code |
| clone | https://ml4code.github.io/tags.html#clone |
| code completion | https://ml4code.github.io/tags.html#code completion |
| code generation | https://ml4code.github.io/tags.html#code generation |
| code similarity | https://ml4code.github.io/tags.html#code similarity |
| compilation | https://ml4code.github.io/tags.html#compilation |
| completion | https://ml4code.github.io/tags.html#completion |
| cybersecurity | https://ml4code.github.io/tags.html#cybersecurity |
| dataset | https://ml4code.github.io/tags.html#dataset |
| decompilation | https://ml4code.github.io/tags.html#decompilation |
| defect | https://ml4code.github.io/tags.html#defect |
| deobfuscation | https://ml4code.github.io/tags.html#deobfuscation |
| documentation | https://ml4code.github.io/tags.html#documentation |
| dynamic | https://ml4code.github.io/tags.html#dynamic |
| edit | https://ml4code.github.io/tags.html#edit |
| editing | https://ml4code.github.io/tags.html#editing |
| education | https://ml4code.github.io/tags.html#education |
| evaluation | https://ml4code.github.io/tags.html#evaluation |
| execution | https://ml4code.github.io/tags.html#execution |
| feature location | https://ml4code.github.io/tags.html#feature location |
| fuzzing | https://ml4code.github.io/tags.html#fuzzing |
| generalizability | https://ml4code.github.io/tags.html#generalizability |
| generation | https://ml4code.github.io/tags.html#generation |
| GNN | https://ml4code.github.io/tags.html#GNN |
| grammar | https://ml4code.github.io/tags.html#grammar |
| human evaluation | https://ml4code.github.io/tags.html#human evaluation |
| information extraction | https://ml4code.github.io/tags.html#information extraction |
| instruction tuning | https://ml4code.github.io/tags.html#instruction tuning |
| interpretability | https://ml4code.github.io/tags.html#interpretability |
| language model | https://ml4code.github.io/tags.html#language model |
| large language models | https://ml4code.github.io/tags.html#large language models |
| LLM | https://ml4code.github.io/tags.html#LLM |
| logging | https://ml4code.github.io/tags.html#logging |
| memorization | https://ml4code.github.io/tags.html#memorization |
| metrics | https://ml4code.github.io/tags.html#metrics |
| migration | https://ml4code.github.io/tags.html#migration |
| naming | https://ml4code.github.io/tags.html#naming |
| natural language generation | https://ml4code.github.io/tags.html#natural language generation |
| natural language processing | https://ml4code.github.io/tags.html#natural language processing |
| notebook | https://ml4code.github.io/tags.html#notebook |
| optimization | https://ml4code.github.io/tags.html#optimization |
| pattern mining | https://ml4code.github.io/tags.html#pattern mining |
| plagiarism detection | https://ml4code.github.io/tags.html#plagiarism detection |
| pretraining | https://ml4code.github.io/tags.html#pretraining |
| program analysis | https://ml4code.github.io/tags.html#program analysis |
| program synthesis | https://ml4code.github.io/tags.html#program synthesis |
| question answering | https://ml4code.github.io/tags.html#question answering |
| refactoring | https://ml4code.github.io/tags.html#refactoring |
| repair | https://ml4code.github.io/tags.html#repair |
| representation | https://ml4code.github.io/tags.html#representation |
| retrieval | https://ml4code.github.io/tags.html#retrieval |
| Reverse Engineering | https://ml4code.github.io/tags.html#Reverse Engineering |
| review | https://ml4code.github.io/tags.html#review |
| search | https://ml4code.github.io/tags.html#search |
| static | https://ml4code.github.io/tags.html#static |
| static analysis | https://ml4code.github.io/tags.html#static analysis |
| style | https://ml4code.github.io/tags.html#style |
| summarization | https://ml4code.github.io/tags.html#summarization |
| survey | https://ml4code.github.io/tags.html#survey |
| synthesis | https://ml4code.github.io/tags.html#synthesis |
| test generation | https://ml4code.github.io/tags.html#test generation |
| tool | https://ml4code.github.io/tags.html#tool |
| topic modeling | https://ml4code.github.io/tags.html#topic modeling |
| topic modelling | https://ml4code.github.io/tags.html#topic modelling |
| traceability | https://ml4code.github.io/tags.html#traceability |
| Transformer | https://ml4code.github.io/tags.html#Transformer |
| Transformers | https://ml4code.github.io/tags.html#Transformers |
| translation | https://ml4code.github.io/tags.html#translation |
| types | https://ml4code.github.io/tags.html#types |
| variable misuse | https://ml4code.github.io/tags.html#variable misuse |
| verification | https://ml4code.github.io/tags.html#verification |
| vulnerability | https://ml4code.github.io/tags.html#vulnerability |
| Adversarial Examples for Models of Code | https://ml4code.github.io/publications/yefet2019adversarial/ |
| Generating Adversarial Examples for Holding Robustness of Source Code Processing Models | https://ml4code.github.io/publications/zhang2020generating/ |
| Adversarial Robustness for Code | https://ml4code.github.io/publications/bielik2020adversarial/ |
| Embedding Java Classes with code2vec: Improvements from Variable Obfuscation | https://ml4code.github.io/publications/compton2020embedding/ |
| On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations | https://ml4code.github.io/publications/rabin2021generalizability/ |
| You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion | https://ml4code.github.io/publications/schuster2021you/ |
| Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022understanding/ |
| Semantic Robustness of Models of Source Code | https://ml4code.github.io/publications/henkel2020semantic/ |
| Backdoors in Neural Models of Source Code | https://ml4code.github.io/publications/ramakrishnan2020backdoors/ |
| Lexical Statistical Machine Translation for Language Migration | https://ml4code.github.io/publications/nguyen2013lexical/ |
| Statistical Learning Approach for Mining API Usage Mappings for Code Migration | https://ml4code.github.io/publications/nguyen2014statistical/ |
| Parameter-Free Probabilistic API Mining across GitHub | https://ml4code.github.io/publications/fowkes2016parameter/ |
| Learning API Usages from Bytecode: A Statistical Approach | https://ml4code.github.io/publications/nguyen2016learning/ |
| Deep API Learning | https://ml4code.github.io/publications/gu2016deep/ |
| Mapping API Elements for Code Migration with Vector Representations | https://ml4code.github.io/publications/nguyen2016mapping/ |
| DeepAM: Migrate APIs with Multi-modal Sequence to Sequence Learning | https://ml4code.github.io/publications/gu2017deepam/ |
| Function Assistant: A Tool for NL Querying of APIs | https://ml4code.github.io/publications/richardson2017function/ |
| Learning Technical Correspondences in Technical Documentation | https://ml4code.github.io/publications/richardson2017learning/ |
| Exploring API Embedding for API Usages and Applications | https://ml4code.github.io/publications/nguyen2017exploring/ |
| Finding Likely Errors with Bayesian Specifications | https://ml4code.github.io/publications/murali2017finding/ |
| Bayesian Sketch Learning for Program Synthesis | https://ml4code.github.io/publications/murali2017bayesian/ |
| Polyglot Semantic Parsing in APIs | https://ml4code.github.io/publications/richardson2018polyglot/ |
| Unsupervised Learning of API Aliasing Specifications | https://ml4code.github.io/publications/ederhardt2019unsupervised/ |
| SAR: Learning Cross-Language API Mappings with Little Knowledge | https://ml4code.github.io/publications/bui2019learning/ |
| Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding | https://ml4code.github.io/publications/chen2019mining/ |
| AutoPandas: neural-backed generators for program synthesis | https://ml4code.github.io/publications/bavishi2019autopandas/ |
| Learning from Examples to Improve Code Completion Systems | https://ml4code.github.io/publications/bruch2009learning/ |
| On the Naturalness of Software | https://ml4code.github.io/publications/hindle2012naturalness/ |
| Code Completion with Statistical Language Models | https://ml4code.github.io/publications/raychev2014code/ |
| Graph-based Statistical Language Model for Code | https://ml4code.github.io/publications/nguyen2015graph/ |
| Intelligent Code Completion with Bayesian Networks | https://ml4code.github.io/publications/proksch2015intelligent/ |
| Learning Python Code Suggestion with a Sparse Pointer Network | https://ml4code.github.io/publications/bhoopchand2016learning/ |
| Neural Code Completion | https://ml4code.github.io/publications/wang2016neural/ |
| Code Completion with Neural Attention and Pointer Networks | https://ml4code.github.io/publications/li2017code/ |
| Pythia: AI-assisted Code Completion System | https://ml4code.github.io/publications/svyatkovskiy2019pythia/ |
| Learning Autocompletion from Real-World Datasets | https://ml4code.github.io/publications/aye2020learning/ |
| Sequence Model Design for Code Completion in the Modern IDE | https://ml4code.github.io/publications/aye2020sequence/ |
| Code Prediction by Feeding Trees to Transformers | https://ml4code.github.io/publications/kim2020code/ |
| A Structural Model for Contextual Code Changes | https://ml4code.github.io/publications/brody2020structural/ |
| IntelliCode Compose: Code Generation Using Transformer | https://ml4code.github.io/publications/svyatkovskiy2020intellicode/ |
| Fast and Memory-Efficient Neural Code Completion | https://ml4code.github.io/publications/svyatkovskiy2020fast/ |
| On-the-Fly Adaptation of Source Code Models using Meta-Learning | https://ml4code.github.io/publications/shrivastava2020on-the-fly/ |
| Suggesting Comment Completions for Python using Neural Language Models | https://ml4code.github.io/publications/ciurumelea2020suggesting/ |
| Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models | https://ml4code.github.io/publications/li2021toward/ |
| Learning to Extend Program Graphs to Work-in-Progress Code | https://ml4code.github.io/publications/li2021learning/ |
| Improving Code Autocompletion with Transfer Learning | https://ml4code.github.io/publications/zhou2021improving/ |
| You Autocomplete Me: Poisoning Vulnerabilities in Neural Code Completion | https://ml4code.github.io/publications/schuster2021you/ |
| On the Embeddings of Variables in Recurrent Neural Networks for Source Code | https://ml4code.github.io/publications/chirkova2021embeddings/ |
| ReACC: A Retrieval-Augmented Code Completion Framework | https://ml4code.github.io/publications/lu2022reacc/ |
| All You Need Is Logs: Improving Code Completion by Learning from Anonymous IDE Usage Logs | https://ml4code.github.io/publications/bibaev2022all/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| ConTest: A Unit Test Completion Benchmark featuring Context | https://ml4code.github.io/publications/villmow2021contest/ |
| CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | https://ml4code.github.io/publications/lu2021codexglue/ |
| Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing | https://ml4code.github.io/publications/patil2022exploring/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models | https://ml4code.github.io/publications/chen2024ppm/ |
| Natural Language Models for Predicting Programming Comments | https://ml4code.github.io/publications/movshovitz2013natural/ |
| Using Semantic Unification to Generate Regular Expressions from Natural Language | https://ml4code.github.io/publications/kushman2013using/ |
| NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation | https://ml4code.github.io/publications/gulwani2014nlyze/ |
| Synthesizing Java expressions from free-form queries | https://ml4code.github.io/publications/gvero2015synthesizing/ |
| Learning to Generate Pseudo-code from Source Code using Statistical Machine Translation | https://ml4code.github.io/publications/oda2015learning/ |
| A Bimodal Modelling of Source Code and Natural Language | https://ml4code.github.io/publications/allamanis2015bimodal/ |
| Summarizing Source Code using a Neural Attention Model | https://ml4code.github.io/publications/iyer2016summarizing/ |
| Latent Predictor Networks for Code Generation | https://ml4code.github.io/publications/ling2016latent/ |
| CodeSum: Translate Program Language to Natural Language | https://ml4code.github.io/publications/hu2017codesum/ |
| Automatically Generating Commit Messages from Diffs using Neural Machine Translation | https://ml4code.github.io/publications/jiang2017automatically/ |
| Program Synthesis from Natural Language Using Recurrent Neural Networks | https://ml4code.github.io/publications/lin2017program/ |
| pix2code: Generating Code from a Graphical User Interface Screenshot | https://ml4code.github.io/publications/beltramelli2017pix2code/ |
| Function Assistant: A Tool for NL Querying of APIs | https://ml4code.github.io/publications/richardson2017function/ |
| The Code2Text Challenge: Text Generation in Source Code Libraries | https://ml4code.github.io/publications/richardson2017code2text/ |
| A Syntactic Neural Model for General-Purpose Code Generation | https://ml4code.github.io/publications/yin2017syntactic/ |
| Learning Technical Correspondences in Technical Documentation | https://ml4code.github.io/publications/richardson2017learning/ |
| Generating Regular Expressions from Natural Language Specifications: Are We There Yet? | https://ml4code.github.io/publications/zhong2018generating/ |
| Mapping Language to Code in Programmatic Context | https://ml4code.github.io/publications/iyer2018mapping/ |
| Deep Learning to Detect Redundant Method Comments | https://ml4code.github.io/publications/louis2018deep/ |
| NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System | https://ml4code.github.io/publications/lin2018nl2bash/ |
| Polyglot Semantic Parsing in APIs | https://ml4code.github.io/publications/richardson2018polyglot/ |
| A Retrieve-and-Edit Framework for Predicting Structured Outputs | https://ml4code.github.io/publications/hashimoto2018retrieve/ |
| TypeWriter: Neural Type Prediction with Search-based Validation | https://ml4code.github.io/publications/pradel2019typewriter/ |
| SPoC: Search-based Pseudocode to Code | https://ml4code.github.io/publications/kulal2019spoc/ |
| JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation | https://ml4code.github.io/publications/agashe2019julce/ |
| Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentially | https://ml4code.github.io/publications/zhang2019learning/ |
| NL2Type: Inferring JavaScript Function Types from Natural Language Information | https://ml4code.github.io/publications/malik2019nl2type/ |
| OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraints | https://ml4code.github.io/publications/pandi2020opttyper/ |
| Incorporating External Knowledge through Pre-training for Natural Language to Code Generation | https://ml4code.github.io/publications/xu2020incorporating/ |
| Associating Natural Language Comment and Source Code Entities | https://ml4code.github.io/publications/panthaplackel2020associating/ |
| TAG : Type Auxiliary Guiding for Code Comment Generation | https://ml4code.github.io/publications/cai2020tag/ |
| Deep Just-In-Time Inconsistency Detection Between Comments and Source Code | https://ml4code.github.io/publications/panthaplackel2020deep/ |
| Code to Comment "Translation": Data, Metrics, Baselining & Evaluation | https://ml4code.github.io/publications/gros2020code/ |
| Learning to Update Natural Language Comments Based on Code Changes | https://ml4code.github.io/publications/panthaplackel2020learning/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| Where should I comment my code? A dataset and model for predicting locations that need comments | https://ml4code.github.io/publications/louis2020where/ |
| Suggesting Comment Completions for Python using Neural Language Models | https://ml4code.github.io/publications/ciurumelea2020suggesting/ |
| Co-Training for Commit Classification | https://ml4code.github.io/publications/lee2021cotraining/ |
| Learning to Reverse DNNs from AI Programs Automatically | https://ml4code.github.io/publications/chen2022learning/ |
| Deep Learning Code Fragments for Code Clone Detection | https://ml4code.github.io/publications/white2016deep/ |
| Oreo: detection of clones in the twilight zone | https://ml4code.github.io/publications/saini2018oreo/ |
| Deep Learning Similarities from Different Representations of Source Code | https://ml4code.github.io/publications/tufano2018deep/ |
| Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization | https://ml4code.github.io/publications/ding2019asm2vec/ |
| Learning-based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detection | https://ml4code.github.io/publications/buech2019learning/ |
| funcGNN: A Graph Neural Network Approach to Program Similarity | https://ml4code.github.io/publications/nair2020funcgnn/ |
| Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree | https://ml4code.github.io/publications/wang2020detecting/ |
| Modeling Functional Similarity in Source Code with Graph-Based Siamese Networks | https://ml4code.github.io/publications/mehrotra2020modeling/ |
| Cross-Language Binary-Source Code Matching with Intermediate Representations | https://ml4code.github.io/publications/gui2022cross/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| Repository-Level Prompt Generation for Large Language Models of Code | https://ml4code.github.io/publications/shrivastava2020repository/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| A Machine Learning Framework for Programming by Example | https://ml4code.github.io/publications/menon2013machine/ |
| Using Semantic Unification to Generate Regular Expressions from Natural Language | https://ml4code.github.io/publications/kushman2013using/ |
| Structured Generative Models of Natural Source Code | https://ml4code.github.io/publications/maddison2014structured/ |
| Code Completion with Statistical Language Models | https://ml4code.github.io/publications/raychev2014code/ |
| NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation | https://ml4code.github.io/publications/gulwani2014nlyze/ |
| Phrase-Based Statistical Translation of Programming Languages | https://ml4code.github.io/publications/karaivanov2014phrase/ |
| Synthesizing Java expressions from free-form queries | https://ml4code.github.io/publications/gvero2015synthesizing/ |
| Visualizing and Understanding Recurrent Networks | https://ml4code.github.io/publications/karpathy2015visualizing/ |
| A deep language model for software code | https://ml4code.github.io/publications/dam2016deep/ |
| Learning Programs from Noisy Data | https://ml4code.github.io/publications/raychev2016learning/ |
| PHOG: Probabilistic Model for Code | https://ml4code.github.io/publications/bielik2016phog/ |
| Latent Predictor Networks for Code Generation | https://ml4code.github.io/publications/ling2016latent/ |
| Program Synthesis from Natural Language Using Recurrent Neural Networks | https://ml4code.github.io/publications/lin2017program/ |
| pix2code: Generating Code from a Graphical User Interface Screenshot | https://ml4code.github.io/publications/beltramelli2017pix2code/ |
| A Syntactic Neural Model for General-Purpose Code Generation | https://ml4code.github.io/publications/yin2017syntactic/ |
| Neural Attribute Machines for Program Generation | https://ml4code.github.io/publications/amodio2017neural/ |
| Abstract Syntax Networks for Code Generation and Semantic Parsing | https://ml4code.github.io/publications/rabinovich2017abstract/ |
| Synthesizing benchmarks for predictive modeling | https://ml4code.github.io/publications/cummins2017synthesizing/ |
| DeepFix: Fixing Common C Language Errors by Deep Learning | https://ml4code.github.io/publications/gupta2017deepfix/ |
| Deep Reinforcement Learning for Programming Language Correction | https://ml4code.github.io/publications/gupta2018deep/ |
| Bayesian Sketch Learning for Program Synthesis | https://ml4code.github.io/publications/murali2017bayesian/ |
| Compiler Fuzzing through Deep Learning | https://ml4code.github.io/publications/cummins2018compiler/ |
| Generating Regular Expressions from Natural Language Specifications: Are We There Yet? | https://ml4code.github.io/publications/zhong2018generating/ |
| Mapping Language to Code in Programmatic Context | https://ml4code.github.io/publications/iyer2018mapping/ |
| NL2Bash: A Corpus and Semantic Parser for Natural Language Interface to the Linux Operating System | https://ml4code.github.io/publications/lin2018nl2bash/ |
| CODIT: Code Editing with Tree-Based Neural Machine Translation | https://ml4code.github.io/publications/chakraborty2018tree2tree/ |
| A Retrieve-and-Edit Framework for Predicting Structured Outputs | https://ml4code.github.io/publications/hashimoto2018retrieve/ |
| Learning to Generate Corrective Patches using Neural Machine Translation | https://ml4code.github.io/publications/hata2018learning/ |
| Learning to Repair Software Vulnerabilities with Generative Adversarial Networks | https://ml4code.github.io/publications/harer2018learning/ |
| SampleFix: Learning to Correct Programs by Sampling Diverse Fixes | https://ml4code.github.io/publications/hajipour2019samplefix/ |
| A Grammar-Based Structural CNN Decoder for Code Generation | https://ml4code.github.io/publications/sun2019grammar/ |
| SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair | https://ml4code.github.io/publications/chen2019sequencer/ |
| Generative Code Modeling with Graphs | https://ml4code.github.io/publications/brockschmidt2019generative/ |
| Structural Language Models for Any-Code Generation | https://ml4code.github.io/publications/alon2019structural/ |
| Code Generation as a Dual Task of Code Summarization | https://ml4code.github.io/publications/wei2019code/ |
| DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing | https://ml4code.github.io/publications/liu2019deepfuzz/ |
| A case study on machine learning for synthesizing benchmarks | https://ml4code.github.io/publications/goens2019case/ |
| Learning Programmatic Idioms for Scalable Semantic Parsing | https://ml4code.github.io/publications/iyer2019learning/ |
| Incorporating External Knowledge through Pre-training for Natural Language to Code Generation | https://ml4code.github.io/publications/xu2020incorporating/ |
| Semantic Scaffolds for Pseudocode-to-Code Generation | https://ml4code.github.io/publications/zhong2020semantic/ |
| Unit Test Case Generation with Transformers | https://ml4code.github.io/publications/tufano2020unit/ |
| Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers | https://ml4code.github.io/publications/tufano2020generating/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| IntelliCode Compose: Code Generation Using Transformer | https://ml4code.github.io/publications/svyatkovskiy2020intellicode/ |
| Retrieval Augmented Code Generation and Summarization | https://ml4code.github.io/publications/parvez2021retrieval/ |
| Energy-Based Models for Code Generation under Compilability Constraints | https://ml4code.github.io/publications/korbak2021energy/ |
| Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy | https://ml4code.github.io/publications/clement2021long/ |
| Time-Efficient Code Completion Model for the R Programming Language | https://ml4code.github.io/publications/popov2021time/ |
| Shellcode_IA32: A Dataset for Automatic Shellcode Generation | https://ml4code.github.io/publications/liguori2021shellcode_ia32/ |
| TOGA: A Neural Method for Test Oracle Generation | https://ml4code.github.io/publications/dinella2022toga/ |
| InCoder: A Generative Model for Code Infilling and Synthesis | https://ml4code.github.io/publications/fried2022incoder/ |
| DocCoder: Generating Code by Retrieving and Reading Docs | https://ml4code.github.io/publications/zhou2022docoder/ |
| Human perceiving behavior modeling in evaluation of code generation models | https://ml4code.github.io/publications/kovalchuk2022human/ |
| Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models | https://ml4code.github.io/publications/vaithilingam2022expectation/ |
| Open-ended Knowledge Tracing | https://ml4code.github.io/publications/liu2022open/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| MISIM: An End-to-End Neural Code Similarity System | https://ml4code.github.io/publications/ye2020misim/ |
| Senatus - A Fast and Accurate Code-to-Code Recommendation Engine | https://ml4code.github.io/publications/silavong2022senatus/ |
| Cross-Language Binary-Source Code Matching with Intermediate Representations | https://ml4code.github.io/publications/gui2022cross/ |
| CV4Code: Sourcecode Understanding via Visual Code Representations | https://ml4code.github.io/publications/shi2022cv4code/ |
| Can Large Language Model Detect Plagiarism in Source Code? | https://ml4code.github.io/publications/brach2024can/ |
| DeepDelta: Learning to Repair Compilation Errors | https://ml4code.github.io/publications/mesbah2019deepdelta/ |
| A Neural Approach to Decompiled Identifier Renaming | https://ml4code.github.io/publications/lacomis2019neural/ |
| Static Neural Compiler Optimization via Deep Reinforcement Learning | https://ml4code.github.io/publications/mammadli2020static/ |
| ComPy-Learn: A toolbox for exploring machine learning representations for compilers | https://ml4code.github.io/publications/brauckmann2020compy/ |
| Compiler-based graph representations for deep learning models of code | https://ml4code.github.io/publications/brauckmann2020compiler/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation | https://ml4code.github.io/publications/zhang2023repocoder/ |
| RepoFusion: Training Code Models to Understand Your Repository | https://ml4code.github.io/publications/shrivastava2023repofusion/ |
| A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Tasks | https://ml4code.github.io/publications/casey2024survey/ |
| A parallel corpus of Python functions and documentation strings for automated code documentation and code generation | https://ml4code.github.io/publications/barone2017parallel/ |
| StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow | https://ml4code.github.io/publications/yao2018staqc/ |
| Learning to Mine Aligned Code and Natural Language Pairs from Stack Overflow | https://ml4code.github.io/publications/yin2018mining/ |
| Public Git Archive: a Big Code dataset for all | https://ml4code.github.io/publications/markovtsev2018public/ |
| CodeSearchNet Challenge: Evaluating the State of Semantic Code Search | https://ml4code.github.io/publications/husain2019codesearchnet/ |
| JuICe: A Large Scale Distantly Supervised Dataset for Open Domain Context-based Code Generation | https://ml4code.github.io/publications/agashe2019julce/ |
| Neural Code Search Evaluation Dataset | https://ml4code.github.io/publications/li2019neural/ |
| Recommendations for Datasets for Source Code Summarization | https://ml4code.github.io/publications/leclair2019recommendations/ |
| The Adverse Effects of Code Duplication in Machine Learning Models of Code | https://ml4code.github.io/publications/allamanis2019adverse/ |
| Graph4Code: A Machine Interpretable Knowledge Graph for Code | https://ml4code.github.io/publications/abdelaziz2020graph4code/ |
| Associating Natural Language Comment and Source Code Entities | https://ml4code.github.io/publications/panthaplackel2020associating/ |
| Code and Named Entity Recognition in StackOverflow | https://ml4code.github.io/publications/tabassum2020code/ |
| ProGraML: Graph-based Deep Learning for Program Optimization and Analysis | https://ml4code.github.io/publications/cummins2020programl/ |
| Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Size | https://ml4code.github.io/publications/monperrus2021megadiff/ |
| CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model | https://ml4code.github.io/publications/jung2021commitbert/ |
| CoSQA: 20,000+ Web Queries for Code Search and Question Answering | https://ml4code.github.io/publications/huang2021cosqa/ |
| ConTest: A Unit Test Completion Benchmark featuring Context | https://ml4code.github.io/publications/villmow2021contest/ |
| A large-scale benchmark for few-shot program induction and synthesis | https://ml4code.github.io/publications/alet2021largescale/ |
| Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation | https://ml4code.github.io/publications/orlanski2021reading/ |
| Time-Efficient Code Completion Model for the R Programming Language | https://ml4code.github.io/publications/popov2021time/ |
| ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inference | https://ml4code.github.io/publications/mir2021manytypes4py/ |
| Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks | https://ml4code.github.io/publications/puri2021project/ |
| Shellcode_IA32: A Dataset for Automatic Shellcode Generation | https://ml4code.github.io/publications/liguori2021shellcode_ia32/ |
| Impact of Evaluation Methodologies on Code Summarization | https://ml4code.github.io/publications/nie2021evaluation/ |
| Text-to-SQL in the Wild: A Naturally-Occurring Dataset Based on Stack Exchange Data | https://ml4code.github.io/publications/hazoom2021text/ |
| The Stack: 3TB of permissively licensed source code | https://ml4code.github.io/publications/kocetkov2022stack/ |
| Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions | https://ml4code.github.io/publications/bieber2022static/ |
| Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing | https://ml4code.github.io/publications/patil2022exploring/ |
| JEMMA: An Extensible Java Dataset for ML4Code Applications | https://ml4code.github.io/publications/karmakar2022jemma/ |
| OctoPack: Instruction Tuning Code Large Language Models | https://ml4code.github.io/publications/muennighoff2023octopack/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection | https://ml4code.github.io/publications/chen2023diversevul/ |
| Learning to Align the Source Code to the Compiled Object Code | https://ml4code.github.io/publications/levy2017learning/ |
| Towards Neural Decompilation | https://ml4code.github.io/publications/katz2019towards/ |
| Coda: An End-to-End Neural Program Decompiler | https://ml4code.github.io/publications/fu2019coda/ |
| DIRECT: A Transformer-based Model for Decompiled Identifier Renaming | https://ml4code.github.io/publications/nitin2021direct/ |
| Code Translation with Compiler Representations | https://ml4code.github.io/publications/szafraniec2022code/ |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | https://ml4code.github.io/publications/tan2024llm4decompile/ |
| Using Web Corpus Statistics for Program Analysis | https://ml4code.github.io/publications/hsiao2014using/ |
| On the “Naturalness” of Buggy Code | https://ml4code.github.io/publications/ray2015naturalness/ |
| Bugram: bug detection with n-gram language models | https://ml4code.github.io/publications/wang2016bugram/ |
| Automatically Learning Semantic Features for Defect Prediction | https://ml4code.github.io/publications/wang2016automatically/ |
| Software Defect Prediction via Convolutional Neural Network | https://ml4code.github.io/publications/li2017software/ |
| Deep Learning to Find Bugs | https://ml4code.github.io/publications/pradel2017deep/ |
| Open Vocabulary Learning on Source Code with a Graph-Structured Cache | https://ml4code.github.io/publications/cvitkovic2018open/ |
| Learning to Represent Programs with Graphs | https://ml4code.github.io/publications/allamanis2018learning/ |
| Exploring the Naturalness of Buggy Code with Recurrent Neural Network | https://ml4code.github.io/publications/lanchantin2018exploring/ |
| Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networks | https://ml4code.github.io/publications/li2019improving/ |
| Scalable Taint Specification Inference with Big Code | https://ml4code.github.io/publications/chibotaru2019scalable/ |
| Neural Attribution for Semantic Bug-Localization in Student Programs | https://ml4code.github.io/publications/gupta2019neural/ |
| Learning Semantic Program Embeddings with Graph Interval Neural Network | https://ml4code.github.io/publications/wang2020learning/ |
| Global Relational Models of Source Code | https://ml4code.github.io/publications/hellendoorn2020global/ |
| OffSide: Learning to Identify Mistakes in Boundary Conditions | https://ml4code.github.io/publications/briem2020offside/ |
| SCELMo: Source Code Embeddings from Language Models | https://ml4code.github.io/publications/karampatsis2020scelmo/ |
| Self-Supervised Bug Detection and Repair | https://ml4code.github.io/publications/allamanis2021self/ |
| Co-Training for Commit Classification | https://ml4code.github.io/publications/lee2021cotraining/ |
| Deep Learning based Vulnerability Detection: Are We There Yet? | https://ml4code.github.io/publications/chakraborty2020deep/ |
| On Distribution Shift in Learning-based Bug Detectors | https://ml4code.github.io/publications/he2022distribution/ |
| Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions | https://ml4code.github.io/publications/bieber2022static/ |
| Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixes | https://ml4code.github.io/publications/richter2022can/ |
| Large Language Models and Simple, Stupid Bugs | https://ml4code.github.io/publications/jesse2023large/ |
| Predicting Program Properties from “Big Code” | https://ml4code.github.io/publications/raychev2015predicting/ |
| Statistical Deobfuscation of Android Applications | https://ml4code.github.io/publications/bichsel2016statistical/ |
| Towards Better Program Obfuscation: Optimization via Language Models | https://ml4code.github.io/publications/liu2016towards/ |
| Recovering Clear, Natural Identifiers from Obfuscated JS Names | https://ml4code.github.io/publications/vasilescu2017recovering/ |
| Recovering Variable Names for Minified Code with Usage Contexts | https://ml4code.github.io/publications/tran2019recovering/ |
| Neural Reverse Engineering of Stripped Binaries | https://ml4code.github.io/publications/david2019neural/ |
| A Neural Approach to Decompiled Identifier Renaming | https://ml4code.github.io/publications/lacomis2019neural/ |
| Natural Language Models for Predicting Programming Comments | https://ml4code.github.io/publications/movshovitz2013natural/ |
| A parallel corpus of Python functions and documentation strings for automated code documentation and code generation | https://ml4code.github.io/publications/barone2017parallel/ |
| Learning Semantic Correspondences in Technical Documentation | https://ml4code.github.io/publications/richardson2017learning/ |
| Deep Learning to Detect Redundant Method Comments | https://ml4code.github.io/publications/louis2018deep/ |
| Improving Automatic Source Code Summarization via Deep Reinforcement Learning | https://ml4code.github.io/publications/wan2018improving/ |
| Structured Neural Summarization | https://ml4code.github.io/publications/fernandes2019structured/ |
| A Neural Model for Generating Natural Language Summaries of Program Subroutines | https://ml4code.github.io/publications/leclair2019neural/ |
| TAG: Type Auxiliary Guiding for Code Comment Generation | https://ml4code.github.io/publications/cai2020tag/ |
| TranS^3: A Transformer-based Framework for Unifying Code Summarization and Code Search | https://ml4code.github.io/publications/wang2020trans/ |
| Deep Just-In-Time Inconsistency Detection Between Comments and Source Code | https://ml4code.github.io/publications/panthaplackel2020deep/ |
| Code to Comment "Translation": Data, Metrics, Baselining & Evaluation | https://ml4code.github.io/publications/gros2020code/ |
| Learning to Update Natural Language Comments Based on Code Changes | https://ml4code.github.io/publications/panthaplackel2020learning/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| NaturalCC: A Toolkit to Naturalize the Source Code Corpus | https://ml4code.github.io/publications/wan2020naturalcc/ |
| Where should I comment my code? A dataset and model for predicting locations that need comments | https://ml4code.github.io/publications/louis2020where/ |
| Suggesting Comment Completions for Python using Neural Language Models | https://ml4code.github.io/publications/ciurumelea2020suggesting/ |
| Automating Just-In-Time Comment Updating | https://ml4code.github.io/publications/liu2020automating/ |
| Learning to Describe Solutions for Bug Reports Based on Developer Discussions | https://ml4code.github.io/publications/panthaplackel2021learning/ |
| Assemble Foundation Models for Automatic Code Summarization | https://ml4code.github.io/publications/jian2022assemble/ |
| LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition | https://ml4code.github.io/publications/sharma2022lamner/ |
| Learning Scalable and Precise Representation of Program Semantics | https://ml4code.github.io/publications/wang2019learning/ |
| Blended, precise semantic program embeddings | https://ml4code.github.io/publications/wang2020blended/ |
| Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks | https://ml4code.github.io/publications/bieber2020learning/ |
| TraceFixer: Execution Trace-Driven Program Repair | https://ml4code.github.io/publications/bouzenia2023tracefixer/ |
| Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learning | https://ml4code.github.io/publications/yadavally2024predictive/ |
| A Study of Repetitiveness of Code Changes in Software Evolution | https://ml4code.github.io/publications/nguyen2013study/ |
| Automatically Generating Commit Messages from Diffs using Neural Machine Translation | https://ml4code.github.io/publications/jiang2017automatically/ |
| A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes | https://ml4code.github.io/publications/loyola2017neural/ |
| Content Aware Source Code Change Description Generation | https://ml4code.github.io/publications/loyola2018content/ |
| Learning How to Mutate Source Code from Bug-Fixes | https://ml4code.github.io/publications/tufano2018learning/ |
| Neural-Machine-Translation-Based Commit Message Generation: How Far Are We? | https://ml4code.github.io/publications/liu2018neural/ |
| Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patterns | https://ml4code.github.io/publications/nguyen2019graph/ |
| On Learning Meaningful Code Changes via Neural Machine Translation | https://ml4code.github.io/publications/tufano2019learning/ |
| Learning to Fix Build Errors with Graph2Diff Neural Networks | https://ml4code.github.io/publications/tarlow2019learning/ |
| Generating commit messages from diffs using pointer-generator network | https://ml4code.github.io/publications/liu2019generating/ |
| Commit Message Generation for Source Code Changes | https://ml4code.github.io/publications/xu2019commit/ |
| DeepDelta: Learning to Repair Compilation Errors | https://ml4code.github.io/publications/mesbah2019deepdelta/ |
| Commit2Vec: Learning Distributed Representations of Code Changes | https://ml4code.github.io/publications/commit2vec2019lozoya/ |
| Learning to Represent Edits | https://ml4code.github.io/publications/yin2019learning/ |
| Neural Networks for Modeling Source Code Edits | https://ml4code.github.io/publications/zhao2019neural/ |
| DLFix: Context-based Code Transformation Learning for Automated Program Repair | https://ml4code.github.io/publications/li2020dlfix/ |
| Hoppity: Learning Bug Detection and Repair | https://ml4code.github.io/publications/dinella2020hoppity/ |
| CC2Vec: Distributed Representations of Code Changes | https://ml4code.github.io/publications/hoang2020cc2vec/ |
| Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | https://ml4code.github.io/publications/yasunaga2020graph/ |
| Copy that! Editing Sequences by Copying Spans | https://ml4code.github.io/publications/panthaplackel2020copy/ |
| Deep Just-In-Time Inconsistency Detection Between Comments and Source Code | https://ml4code.github.io/publications/panthaplackel2020deep/ |
| A Structural Model for Contextual Code Changes | https://ml4code.github.io/publications/brody2020structural/ |
| Learning to Update Natural Language Comments Based on Code Changes | https://ml4code.github.io/publications/panthaplackel2020learning/ |
| Unsupervised Learning of General-Purpose Embeddings for Code Changes | https://ml4code.github.io/publications/pravilov2021unsupervised/ |
| Megadiff: A Dataset of 600k Java Source Code Changes Categorized by Diff Size | https://ml4code.github.io/publications/monperrus2021megadiff/ |
| Semantic Bug Seeding: A Learning-Based Approach for Creating Realistic Bugs | https://ml4code.github.io/publications/patra2021semantic/ |
| Jointly Learning to Repair Code and Generate Commit Message | https://ml4code.github.io/publications/bai2021jointly/ |
| DeepMerge: Learning to Merge Programs | https://ml4code.github.io/publications/dinella2021deepmerge/ |
| On Multi-Modal Learning of Editing Source Code | https://ml4code.github.io/publications/chakraborty2021multimodal/ |
| A Syntax-Guided Edit Decoder for Neural Program Repair | https://ml4code.github.io/publications/zhu2921syntax/ |
| Learning to Model Editing Processes | https://ml4code.github.io/publications/reid2022learning/ |
| CoditT5: Pretraining for Source Code and Natural Language Editing | https://ml4code.github.io/publications/zhang2022coditt5/ |
| Grace: Language Models Meet Code Edits | https://ml4code.github.io/publications/gupta2023grace/ |
| Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions | https://ml4code.github.io/publications/cassano2023can/ |
| A system to grade computer programming skills using machine learning | https://ml4code.github.io/publications/srikant2014system/ |
| Learning Program Embeddings to Propagate Feedback on Student Code | https://ml4code.github.io/publications/piech2015learning/ |
| Question Independent Grading using Machine Learning: The Case of Computer Program Grading | https://ml4code.github.io/publications/singh2016question/ |
| ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback | https://ml4code.github.io/publications/wu2021prototransformer/ |
| Open-ended Knowledge Tracing | https://ml4code.github.io/publications/liu2022open/ |
| Testing Neural Program Analyzers | https://ml4code.github.io/publications/rabin2019testing/ |
| The Adverse Effects of Code Duplication in Machine Learning Models of Code | https://ml4code.github.io/publications/allamanis2019adverse/ |
| Towards Demystifying Dimensions of Source Code Embeddings | https://ml4code.github.io/publications/rabin2020demystifying/ |
| CodeBLEU: a Method for Automatic Evaluation of Code Synthesis | https://ml4code.github.io/publications/ren2020codebleu/ |
| On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations | https://ml4code.github.io/publications/rabin2021generalizability/ |
| Impact of Evaluation Methodologies on Code Summarization | https://ml4code.github.io/publications/nie2021evaluation/ |
| Memorization and Generalization in Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022memorization/ |
| An Extensive Study on Pre-trained Models for Program Understanding and Generation | https://ml4code.github.io/publications/zeng2022extensive/ |
| Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis | https://ml4code.github.io/publications/naik2022probing/ |
| Semantic Similarity Metrics for Evaluating Source Code Summarization | https://ml4code.github.io/publications/haque2022semantic/ |
| Human perceiving behavior modeling in evaluation of code generation models | https://ml4code.github.io/publications/kovalchuk2022human/ |
| Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing | https://ml4code.github.io/publications/patil2022exploring/ |
| Natural Language to Code Generation in Interactive Data Science Notebooks | https://ml4code.github.io/publications/yin2022natural/ |
| CrystalBLEU: Precisely and Efficiently Measuring the Similarity of Code | https://ml4code.github.io/publications/eghbali2022crystalbleu/ |
| Productivity Assessment of Neural Code Completion | https://ml4code.github.io/publications/ziegler2022productivity/ |
| CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code | https://ml4code.github.io/publications/zhou2022codebertscore/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| CodeScore: Evaluating Code Generation by Learning Code Execution | https://ml4code.github.io/publications/dong2023codescore/ |
| PPM: Automated Generation of Diverse Programming Problems for Benchmarking Code Generation Models | https://ml4code.github.io/publications/chen2024ppm/ |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | https://ml4code.github.io/publications/tan2024llm4decompile/ |
| Learning to Execute | https://ml4code.github.io/publications/zaremba2014learning/ |
| Show Your Work: Scratchpads for Intermediate Computation with Language Models | https://ml4code.github.io/publications/nye2021show/ |
| SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics | https://ml4code.github.io/publications/ye2022selfapr/ |
| CodeT: Code Generation with Generated Tests | https://ml4code.github.io/publications/chen2022codet/ |
| Code Execution with Pre-trained Language Models | https://ml4code.github.io/publications/liu2023code/ |
| LExecutor: Learning-Guided Execution | https://ml4code.github.io/publications/souza2023lexecutor/ |
| Exploring the Use of Deep Learning for Feature Location | https://ml4code.github.io/publications/corley2015exploring/ |
| Learning to Fuzz: Application-Independent Fuzz Testing with Probabilistic, Generative Models of Input Data | https://ml4code.github.io/publications/patra2016learning/ |
| Compiler Fuzzing through Deep Learning | https://ml4code.github.io/publications/cummins2018compiler/ |
| NEUZZ: Efficient Fuzzing with Neural Program Smoothing | https://ml4code.github.io/publications/she2019neuzz/ |
| DeepFuzz: Automatic Generation of Syntax Valid C Programs for Fuzz Testing | https://ml4code.github.io/publications/liu2019deepfuzz/ |
| Learning to Fuzz from Symbolic Execution with Application to Smart Contracts | https://ml4code.github.io/publications/he2019learning/ |
| Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer | https://ml4code.github.io/publications/lee2020montage/ |
| Universal Fuzzing via Large Language Models | https://ml4code.github.io/publications/xia2023universal/ |
| On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations | https://ml4code.github.io/publications/rabin2021generalizability/ |
| Memorization and Generalization in Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022memorization/ |
| Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing | https://ml4code.github.io/publications/patil2022exploring/ |
| Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation | https://ml4code.github.io/publications/li2023think/ |
| Gated Graph Sequence Neural Networks | https://ml4code.github.io/publications/li2016gated/ |
| Open Vocabulary Learning on Source Code with a Graph-Structured Cache | https://ml4code.github.io/publications/cvitkovic2018open/ |
| Learning to Represent Programs with Graphs | https://ml4code.github.io/publications/allamanis2018learning/ |
| Simulating Execution Time of Tensor Programs using Graph Neural Networks | https://ml4code.github.io/publications/tomczak2019simulating/ |
| Generative Code Modeling with Graphs | https://ml4code.github.io/publications/brockschmidt2019generative/ |
| Structured Neural Summarization | https://ml4code.github.io/publications/fernandes2019structured/ |
| Neural Reverse Engineering of Stripped Binaries | https://ml4code.github.io/publications/david2019neural/ |
| Using GGNN to recommend log statement level | https://ml4code.github.io/publications/li2019using/ |
| Program Classification Using Gated Graph Attention Neural Network for Online Programming Service | https://ml4code.github.io/publications/lu2019program/ |
| AutoPandas: neural-backed generators for program synthesis | https://ml4code.github.io/publications/bavishi2019autopandas/ |
| Learning to Fuzz from Symbolic Execution with Application to Smart Contracts | https://ml4code.github.io/publications/he2019learning/ |
| Inferring Javascript types using Graph Neural Networks | https://ml4code.github.io/publications/schrouff2019inferring/ |
| Learning Semantic Program Embeddings with Graph Interval Neural Network | https://ml4code.github.io/publications/wang2020learning/ |
| Global Relational Models of Source Code | https://ml4code.github.io/publications/hellendoorn2020global/ |
| Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks | https://ml4code.github.io/publications/zhou2019devign/ |
| LambdaNet: Probabilistic Type Inference using Graph Neural Networks | https://ml4code.github.io/publications/wei2020lambdanet/ |
| Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | https://ml4code.github.io/publications/yasunaga2020graph/ |
| Typilus: Neural Type Hints | https://ml4code.github.io/publications/allamanis2020typilus/ |
| Learning Graph Structure With A Finite-State Automaton Layer | https://ml4code.github.io/publications/johnson2020learning/ |
| ProGraML: Graph-based Deep Learning for Program Optimization and Analysis | https://ml4code.github.io/publications/cummins2020programl/ |
| Towards Learning Representations of Binary Executable Files for Security Tasks | https://ml4code.github.io/publications/arakelyan2020towards/ |
| funcGNN: A Graph Neural Network Approach to Program Similarity | https://ml4code.github.io/publications/nair2020funcgnn/ |
| Deep Graph Matching and Searching for Semantic Code Retrieval | https://ml4code.github.io/publications/ling2020deep/ |
| ComPy-Learn: A toolbox for exploring machine learning representations for compilers | https://ml4code.github.io/publications/brauckmann2020compy/ |
| Compiler-based graph representations for deep learning models of code | https://ml4code.github.io/publications/brauckmann2020compiler/ |
| Detecting Code Clones with Graph Neural Network and Flow-Augmented Abstract Syntax Tree | https://ml4code.github.io/publications/wang2020detecting/ |
| Modeling Functional Similarity in Source Code with Graph-Based Siamese Networks | https://ml4code.github.io/publications/mehrotra2020modeling/ |
| Learning to Represent Programs with Heterogeneous Graphs | https://ml4code.github.io/publications/wang2020learning2/ |
| Self-Supervised Bug Detection and Repair | https://ml4code.github.io/publications/allamanis2021self/ |
| Structured Statistical Syntax Tree Prediction | https://ml4code.github.io/publications/omar2013structured/ |
| Building Program Vector Representations for Deep Learning | https://ml4code.github.io/publications/mou2014building/ |
| Structured Generative Models of Natural Source Code | https://ml4code.github.io/publications/maddison2014structured/ |
| Mining Idioms from Source Code | https://ml4code.github.io/publications/allamanis2014mining/ |
| Learning to Generate Pseudo-code from Source Code using Statistical Machine Translation | https://ml4code.github.io/publications/oda2015learning/ |
| A Bimodal Modelling of Source Code and Natural Language | https://ml4code.github.io/publications/allamanis2015bimodal/ |
| Learning Programs from Noisy Data | https://ml4code.github.io/publications/raychev2016learning/ |
| PHOG: Probabilistic Model for Code | https://ml4code.github.io/publications/bielik2016phog/ |
| Convolutional Neural Networks over Tree Structures for Programming Language Processing | https://ml4code.github.io/publications/mou2016convolutional/ |
| A Syntactic Neural Model for General-Purpose Code Generation | https://ml4code.github.io/publications/yin2017syntactic/ |
| Neural Attribute Machines for Program Generation | https://ml4code.github.io/publications/amodio2017neural/ |
| Abstract Syntax Networks for Code Generation and Semantic Parsing | https://ml4code.github.io/publications/rabinovich2017abstract/ |
| Mining Semantic Loop Idioms from Big Code | https://ml4code.github.io/publications/allamanis2017mining/ |
| Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks | https://ml4code.github.io/publications/bui2018cross/ |
| CODIT: Code Editing with Tree-Based Neural Machine Translation | https://ml4code.github.io/publications/chakraborty2018tree2tree/ |
| A Grammar-Based Structural CNN Decoder for Code Generation | https://ml4code.github.io/publications/sun2019grammar/ |
| Capturing source code semantics via tree-based convolution over API-enhanced AST | https://ml4code.github.io/publications/chen2019capturing/ |
| Generative Code Modeling with Graphs | https://ml4code.github.io/publications/brockschmidt2019generative/ |
| PathMiner: A Library for Mining of Path-Based Representations of Code | https://ml4code.github.io/publications/kovalenko2019pathminer/ |
| Learning Programmatic Idioms for Scalable Semantic Parsing | https://ml4code.github.io/publications/iyer2019learning/ |
| Automatic Source Code Summarization with Extended Tree-LSTM | https://ml4code.github.io/publications/shido2019automatic/ |
| Learning-based Recursive Aggregation of Abstract Syntax Trees for Code Clone Detection | https://ml4code.github.io/publications/buech2019learning/ |
| A Novel Neural Source Code Representation based on Abstract Syntax Tree | https://ml4code.github.io/publications/zhang2019novel/ |
| Neural-Network Guided Expression Transformation | https://ml4code.github.io/publications/edelmann2019neural/ |
| DLFix: Context-based Code Transformation Learning for Automated Program Repair | https://ml4code.github.io/publications/li2020dlfix/ |
| Modular Tree Network for Source Code Representation Learning | https://ml4code.github.io/publications/wang2020modular/ |
| PSCS: A Path-based Neural Model for Semantic Code Search | https://ml4code.github.io/publications/sun2020pscs/ |
| A Structural Model for Contextual Code Changes | https://ml4code.github.io/publications/brody2020structural/ |
| Predicting Vulnerability in Large Codebases With Deep Code Representation | https://ml4code.github.io/publications/ashwath2020predicting/ |
| TreeBERT: A Tree-Based Pre-Trained Model for Programming Language | https://ml4code.github.io/publications/jiang2021treebert/ |
| Learning to Complete Code with Sketches | https://ml4code.github.io/publications/guo2022learning/ |
| Grounded Copilot: How Programmers Interact with Code-Generating Models | https://ml4code.github.io/publications/barke2022grounded/ |
| Semantic Similarity Metrics for Evaluating Source Code Summarization | https://ml4code.github.io/publications/haque2022semantic/ |
| Human perceiving behavior modeling in evaluation of code generation models | https://ml4code.github.io/publications/kovalchuk2022human/ |
| Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models | https://ml4code.github.io/publications/vaithilingam2022expectation/ |
| What is it like to program with artificial intelligence? | https://ml4code.github.io/publications/sarkar2022what/ |
| Productivity Assessment of Neural Code Completion | https://ml4code.github.io/publications/ziegler2022productivity/ |
| A Hidden Markov Model to Detect Coded Information Islands in Free Text | https://ml4code.github.io/publications/cerulo2013hidden/ |
| Irish: A Hidden Markov Model to detect coded information islands in free text | https://ml4code.github.io/publications/cerulo2015irish/ |
| NIRMAL: Automatic Identification of Software Relevant Tweets Leveraging Language Model | https://ml4code.github.io/publications/sharma2015nirmal/ |
| Extracting Code from Programming Tutorial Videos | https://ml4code.github.io/publications/yadid2016extracting/ |
| A Deep Learning Approach to Identifying Source Code in Images and Video | https://ml4code.github.io/publications/ott2018deep/ |
| Evaluation of Type Inference with Textual Cues | https://ml4code.github.io/publications/shirani2018evaluation/ |
| Code and Named Entity Recognition in StackOverflow | https://ml4code.github.io/publications/tabassum2020code/ |
| Understanding Neural Code Intelligence Through Program Simplification | https://ml4code.github.io/publications/rabin2021understanding/ |
| OctoPack: Instruction Tuning Code Large Language Models | https://ml4code.github.io/publications/muennighoff2023octopack/ |
| Towards Demystifying Dimensions of Source Code Embeddings | https://ml4code.github.io/publications/rabin2020demystifying/ |
| Understanding Neural Code Intelligence Through Program Simplification | https://ml4code.github.io/publications/rabin2021understanding/ |
| Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022understanding/ |
| Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis | https://ml4code.github.io/publications/naik2022probing/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| On the Naturalness of Software | https://ml4code.github.io/publications/hindle2012naturalness/ |
| A Statistical Semantic Language Model for Source Code | https://ml4code.github.io/publications/nguyen2013statistical/ |
| Mining Source Code Repositories at Massive Scale Using Language Modeling | https://ml4code.github.io/publications/allamanis2013mining/ |
| Structured Statistical Syntax Tree Prediction | https://ml4code.github.io/publications/omar2013structured/ |
| Learning Natural Coding Conventions | https://ml4code.github.io/publications/allamanis2014learning/ |
| Structured Generative Models of Natural Source Code | https://ml4code.github.io/publications/maddison2014structured/ |
| Code Completion with Statistical Language Models | https://ml4code.github.io/publications/raychev2014code/ |
| On the Localness of Software | https://ml4code.github.io/publications/tu2014localness/ |
| Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models | https://ml4code.github.io/publications/campbell2014syntax/ |
| Will they like this? Evaluating Code Contributions With Language Models | https://ml4code.github.io/publications/hellendoorn2015will/ |
| Graph-based Statistical Language Model for Code | https://ml4code.github.io/publications/nguyen2015graph/ |
| Products, Developers, and Milestones: How Should I Build My N-Gram Language Model | https://ml4code.github.io/publications/saraiva2015products/ |
| Visualizing and Understanding Recurrent Networks | https://ml4code.github.io/publications/karpathy2015visualizing/ |
| CACHECA: A Cache Language Model Based Code Suggestion Tool | https://ml4code.github.io/publications/franks2015cacheca/ |
| A deep language model for software code | https://ml4code.github.io/publications/dam2016deep/ |
| PHOG: Probabilistic Model for Code | https://ml4code.github.io/publications/bielik2016phog/ |
| Learning Python Code Suggestion with a Sparse Pointer Network | https://ml4code.github.io/publications/bhoopchand2016learning/ |
| A Language Model for Statements of Software Code | https://ml4code.github.io/publications/yang2017language/ |
| Are Deep Neural Networks the Best Choice for Modeling Source Code? | https://ml4code.github.io/publications/hellendoorn2017deep/ |
| Code Completion with Neural Attention and Pointer Networks | https://ml4code.github.io/publications/li2017code/ |
| Building Language Models for Text with Named Entities | https://ml4code.github.io/publications/parvez2018building/ |
| Exploring the Naturalness of Buggy Code with Recurrent Neural Networks | https://ml4code.github.io/publications/lanchantin2018exploring/ |
| Syntax and Sensibility: Using language models to detect and correct syntax errors | https://ml4code.github.io/publications/santos2018syntax/ |
| On the Impact of Refactoring Operations on Code Naturalness | https://ml4code.github.io/publications/lin2019impact/ |
| Pythia: AI-assisted Code Completion System | https://ml4code.github.io/publications/svyatkovskiy2019pythia/ |
| Maybe Deep Neural Networks are the Best Choice for Modeling Source Code | https://ml4code.github.io/publications/karampatsis2019deep/ |
| Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code | https://ml4code.github.io/publications/karampatsis2020big/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| Montage: A Neural Network Language Model-Guided JavaScript Engine Fuzzer | https://ml4code.github.io/publications/lee2020montage/ |
| IntelliCode Compose: Code Generation Using Transformer | https://ml4code.github.io/publications/svyatkovskiy2020intellicode/ |
| On-the-Fly Adaptation of Source Code Models using Meta-Learning | https://ml4code.github.io/publications/shrivastava2020on-the-fly/ |
| CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model | https://ml4code.github.io/publications/jung2021commitbert/ |
| Evaluating Large Language Models Trained on Code | https://ml4code.github.io/publications/chen2021evaluating/ |
| Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models | https://ml4code.github.io/publications/li2021toward/ |
| An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions | https://ml4code.github.io/publications/pearce2021empirical/ |
| Capturing Structural Locality in Non-parametric Language Models | https://ml4code.github.io/publications/xu2021capturing/ |
| Exploration of Convolutional Neural Network models for source code classification | https://ml4code.github.io/publications/barchi2021exploration/ |
| Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy | https://ml4code.github.io/publications/clement2021long/ |
| Time-Efficient Code Completion Model for the R Programming Language | https://ml4code.github.io/publications/popov2021time/ |
| On the Naturalness and Localness of Software Logs | https://ml4code.github.io/publications/gholamian2021naturalness/ |
| Neural Program Generation Modulo Static Analysis | https://ml4code.github.io/publications/mukherjee2021neural/ |
| Memorization and Generalization in Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022memorization/ |
| Efficient Training of Language Models to Fill in the Middle | https://ml4code.github.io/publications/bavarian2022efficient/ |
| Assemble Foundation Models for Automatic Code Summarization | https://ml4code.github.io/publications/jian2022assemble/ |
| A Systematic Evaluation of Large Language Models of Code | https://ml4code.github.io/publications/xu2022systematic/ |
| Synchromesh: Reliable code generation from pre-trained language models | https://ml4code.github.io/publications/poesia2022synchromesh/ |
| Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mapping | https://ml4code.github.io/publications/parisi2022making/ |
| Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding | https://ml4code.github.io/publications/deze2022bridging/ |
| Learning to Complete Code with Sketches | https://ml4code.github.io/publications/guo2022learning/ |
| Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis | https://ml4code.github.io/publications/naik2022probing/ |
| LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition | https://ml4code.github.io/publications/sharma2022lamner/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| Expectation vs. Experience: Evaluating the Usability of Code Generation Tools Powered by Large Language Models | https://ml4code.github.io/publications/vaithilingam2022expectation/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets | https://ml4code.github.io/publications/lomshakov2023fine/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| (Partial) Program Dependence Learning | https://ml4code.github.io/publications/yadavally2023partial/ |
| Can Large Language Model Detect Plagiarism in Source Code? | https://ml4code.github.io/publications/brach2024can/ |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | https://ml4code.github.io/publications/tan2024llm4decompile/ |
| Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | https://ml4code.github.io/publications/li2024rewriting/ |
| A Learning-Based Approach to Static Program Slicing | https://ml4code.github.io/publications/yadavally2024learning/ |
| Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learning | https://ml4code.github.io/publications/yadavally2024predictive/ |
| A Static Evaluation of Code Completion by Large Language Models | https://ml4code.github.io/publications/ding2023static/ |
| Can Large Language Model Detect Plagiarism in Source Code? | https://ml4code.github.io/publications/brach2024can/ |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | https://ml4code.github.io/publications/tan2024llm4decompile/ |
| Using GGNN to recommend log statement level | https://ml4code.github.io/publications/li2019using/ |
| On the Naturalness and Localness of Software Logs | https://ml4code.github.io/publications/gholamian2021naturalness/ |
| Using Deep Learning to Generate Complete Log Statements | https://ml4code.github.io/publications/mastropaolo2022using/ |
| Memorization and Generalization in Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022memorization/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | https://ml4code.github.io/publications/li2024rewriting/ |
| Lexical Statistical Machine Translation for Language Migration | https://ml4code.github.io/publications/nguyen2013lexical/ |
| Statistical Learning Approach for Mining API Usage Mappings for Code Migration | https://ml4code.github.io/publications/nguyen2014statistical/ |
| Divide-and-Conquer Approach for Multi-phase Statistical Migration for Source Code | https://ml4code.github.io/publications/nguyen2015divide/ |
| Phrase-Based Statistical Translation of Programming Languages | https://ml4code.github.io/publications/karaivanov2014phrase/ |
| Using Machine Translation for Converting Python 2 to Python 3 Code | https://ml4code.github.io/publications/aggarwal2015using/ |
| Mapping API Elements for Code Migration with Vector Representations | https://ml4code.github.io/publications/nguyen2016mapping/ |
| Unsupervised Translation of Programming Languages | https://ml4code.github.io/publications/lachaux2020unsupervised/ |
| Leveraging Automated Unit Tests for Unsupervised Code Translation | https://ml4code.github.io/publications/roziere2021leveraging/ |
| Code Translation with Compiler Representations | https://ml4code.github.io/publications/szafraniec2022code/ |
| Learning Natural Coding Conventions | https://ml4code.github.io/publications/allamanis2014learning/ |
| Predicting Program Properties from “Big Code” | https://ml4code.github.io/publications/raychev2015predicting/ |
| Suggesting Accurate Method and Class Names | https://ml4code.github.io/publications/allamanis2015suggesting/ |
| A Convolutional Attention Network for Extreme Summarization of Source Code | https://ml4code.github.io/publications/allamanis2016convolutional/ |
| Statistical Deobfuscation of Android Applications | https://ml4code.github.io/publications/bichsel2016statistical/ |
| Recovering Clear, Natural Identifiers from Obfuscated JS Names | https://ml4code.github.io/publications/vasilescu2017recovering/ |
| Context2Name: A Deep Learning-Based Approach to Infer Natural Variable Names from Usage Contexts | https://ml4code.github.io/publications/bavishi2017context2name/ |
| Learning to Represent Programs with Graphs | https://ml4code.github.io/publications/allamanis2018learning/ |
| A General Path-Based Representation for Predicting Program Properties | https://ml4code.github.io/publications/alon2018general/ |
| code2vec: Learning Distributed Representations of Code | https://ml4code.github.io/publications/alon2019code2vec/ |
| A Neural Model for Method Name Generation from Functional Description | https://ml4code.github.io/publications/gao2019neural/ |
| Recovering Variable Names for Minified Code with Usage Contexts | https://ml4code.github.io/publications/tran2019recovering/ |
| Mercem: Method Name Recommendation Based on Call Graph Embedding | https://ml4code.github.io/publications/yonai2019mercem/ |
| Learning to Spot and Refactor Inconsistent Method Names | https://ml4code.github.io/publications/liu2019learning/ |
| code2seq: Generating Sequences from Structured Representations of Code | https://ml4code.github.io/publications/alon2018code2seq/ |
| Method name suggestion with hierarchical attention networks | https://ml4code.github.io/publications/xu2019method/ |
| Neural Reverse Engineering of Stripped Binaries | https://ml4code.github.io/publications/david2019neural/ |
| A Neural Approach to Decompiled Identifier Renaming | https://ml4code.github.io/publications/lacomis2019neural/ |
| Suggesting Natural Method Names to Check Name Consistencies | https://ml4code.github.io/publications/nguyen2020suggesting/ |
| Towards Demystifying Dimensions of Source Code Embeddings | https://ml4code.github.io/publications/rabin2020demystifying/ |
| Embedding Java Classes with code2vec: Improvements from Variable Obfuscation | https://ml4code.github.io/publications/compton2020embedding/ |
| Semantic Robustness of Models of Source Code | https://ml4code.github.io/publications/henkel2020semantic/ |
| InCoder: A Generative Model for Code Infilling and Synthesis | https://ml4code.github.io/publications/fried2022incoder/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR | https://ml4code.github.io/publications/barchi2019code/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| Can Large Language Model Detect Plagiarism in Source Code? | https://ml4code.github.io/publications/brach2024can/ |
| Natural Language to Code Generation in Interactive Data Science Notebooks | https://ml4code.github.io/publications/yin2022natural/ |
| End-to-end Deep Learning of Optimization Heuristics | https://ml4code.github.io/publications/cummins2017end/ |
| Synthesizing benchmarks for predictive modeling | https://ml4code.github.io/publications/cummins2017synthesizing/ |
| Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR | https://ml4code.github.io/publications/barchi2019code/ |
| Neural-Network Guided Expression Transformation | https://ml4code.github.io/publications/edelmann2019neural/ |
| ComPy-Learn: A toolbox for exploring machine learning representations for compilers | https://ml4code.github.io/publications/brauckmann2020compy/ |
| Compiler-based graph representations for deep learning models of code | https://ml4code.github.io/publications/brauckmann2020compiler/ |
| Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models | https://ml4code.github.io/publications/li2021toward/ |
| Exploration of Convolutional Neural Network models for source code classification | https://ml4code.github.io/publications/barchi2021exploration/ |
| Source Code Classification for Energy Efficiency in Parallel Ultra Low-Power Microcontrollers | https://ml4code.github.io/publications/parisi2021source/ |
| Deep Learning Approaches to Source Code Analysis for Optimization of Heterogeneous Systems: Recent Results, Challenges and Opportunities | https://ml4code.github.io/publications/barchi2022deep/ |
| Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mapping | https://ml4code.github.io/publications/parisi2022making/ |
| DeepPERF: A Deep Learning-Based Approach For Improving Software Performance | https://ml4code.github.io/publications/garg2022deepperf/ |
| Supersonic: Learning to Generate Source Code Optimizations in C/C++ | https://ml4code.github.io/publications/chen2023supersonic/ |
| Rethinking Negative Pairs in Code Search | https://ml4code.github.io/publications/li2023rethinking/ |
| Mining Idioms from Source Code | https://ml4code.github.io/publications/allamanis2014mining/ |
| KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts | https://ml4code.github.io/publications/movshovitz2015kb/ |
| Parameter-Free Probabilistic API Mining across GitHub | https://ml4code.github.io/publications/fowkes2016parameter/ |
| Mining Semantic Loop Idioms from Big Code | https://ml4code.github.io/publications/allamanis2017mining/ |
| Topic modeling of public repositories at scale using names in source code | https://ml4code.github.io/publications/markovtsev2017topic/ |
| Graph-based Mining of In-the-Wild, Fine-grained, Semantic Code Change Patterns | https://ml4code.github.io/publications/nguyen2019graph/ |
| Learning Programmatic Idioms for Scalable Semantic Parsing | https://ml4code.github.io/publications/iyer2019learning/ |
| Mining Idioms in the Wild | https://ml4code.github.io/publications/sivaraman2021mining/ |
| Can Large Language Model Detect Plagiarism in Source Code? | https://ml4code.github.io/publications/brach2024can/ |
| Deep Transfer Learning for Source Code Modeling | https://ml4code.github.io/publications/hussain2019deep/ |
| GraphCodeBERT: Pre-training Code Representations with Data Flow | https://ml4code.github.io/publications/guo2020graphcodebert/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| IntelliCode Compose: Code Generation Using Transformer | https://ml4code.github.io/publications/svyatkovskiy2020intellicode/ |
| Pre-trained Contextual Embedding of Source Code | https://ml4code.github.io/publications/kanade2020pretrained/ |
| CodeBERT: A Pre-Trained Model for Programming and Natural Languages | https://ml4code.github.io/publications/feng2020codebert/ |
| Contrastive Code Representation Learning | https://ml4code.github.io/publications/jain2020contrastive/ |
| SCELMo: Source Code Embeddings from Language Models | https://ml4code.github.io/publications/karampatsis2020scelmo/ |
| Contrastive Learning for Source Code with Structural and Functional Properties | https://ml4code.github.io/publications/ding2021contrastive/ |
| DOBF: A Deobfuscation Pre-Training Objective for Programming Languages | https://ml4code.github.io/publications/roziere2021dobf/ |
| SynCoBERT: Syntax-Guided Multi-Modal Contrastive Pre-Training for Code Representation | https://ml4code.github.io/publications/wang2021syncobert/ |
| Unified Pre-training for Program Understanding and Generation | https://ml4code.github.io/publications/ahmad2021unified/ |
| Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations | https://ml4code.github.io/publications/bui2021efficient/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code | https://ml4code.github.io/publications/wan2022what/ |
| A Factor Graph Model for Software Bug Finding | https://ml4code.github.io/publications/kremenek2007factor/ |
| Predicting Program Properties from “Big Code” | https://ml4code.github.io/publications/raychev2015predicting/ |
| A User-Guided Approach to Program Analysis | https://ml4code.github.io/publications/mangal2015user/ |
| Learning a Strategy for Adapting a Program Analysis via Bayesian Optimisation | https://ml4code.github.io/publications/oh2015learning/ |
| Gated Graph Sequence Neural Networks | https://ml4code.github.io/publications/li2016gated/ |
| Deep Learning to Find Bugs | https://ml4code.github.io/publications/pradel2017deep/ |
| Finding Likely Errors with Bayesian Specifications | https://ml4code.github.io/publications/murali2017finding/ |
| User-guided program reasoning using Bayesian inference | https://ml4code.github.io/publications/raghothaman2018user/ |
| Path-Based Function Embedding and its Application to Specification Mining | https://ml4code.github.io/publications/defreez2018path/ |
| Neural-Augmented Static Analysis of Android Communication | https://ml4code.github.io/publications/zhao2018neural/ |
| Learning Loop Invariants for Program Verification | https://ml4code.github.io/publications/si2018learning/ |
| RefiNym: Using Names to Refine Types | https://ml4code.github.io/publications/dash2018refinym/ |
| Automated Vulnerability Detection in Source Code Using Deep Representation Learning | https://ml4code.github.io/publications/russell2018automated/ |
| Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR | https://ml4code.github.io/publications/barchi2019code/ |
| On the Feasibility of Transfer-learning Code Smells using Deep Learning | https://ml4code.github.io/publications/sharma2019feasibility/ |
| Unsupervised Learning of API Aliasing Specifications | https://ml4code.github.io/publications/ederhardt2019unsupervised/ |
| Scalable Taint Specification Inference with Big Code | https://ml4code.github.io/publications/chibotaru2019scalable/ |
| Neural Bug Finding: A Study of Opportunities and Challenges | https://ml4code.github.io/publications/habib2019neural/ |
| Neural Program Repair by Jointly Learning to Localize and Repair | https://ml4code.github.io/publications/vasic2019neural/ |
| Inferring Javascript types using Graph Neural Networks | https://ml4code.github.io/publications/schrouff2019inferring/ |
| Neural Software Analysis | https://ml4code.github.io/publications/pradel2020neural/ |
| Learning Graph Structure With A Finite-State Automaton Layer | https://ml4code.github.io/publications/johnson2020learning/ |
| Predicting Vulnerability in Large Codebases With Deep Code Representation | https://ml4code.github.io/publications/ashwath2020predicting/ |
| SinkFinder: harvesting hundreds of unknown interesting function pairs with just one seed | https://ml4code.github.io/publications/bian2020sinkfinder/ |
| Exploration of Convolutional Neural Network models for source code classification | https://ml4code.github.io/publications/barchi2021exploration/ |
| Source Code Classification for Energy Efficiency in Parallel Ultra Low-Power Microcontrollers | https://ml4code.github.io/publications/parisi2021source/ |
| Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mapping | https://ml4code.github.io/publications/parisi2022making/ |
| What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code | https://ml4code.github.io/publications/wan2022what/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| (Partial) Program Dependence Learning | https://ml4code.github.io/publications/yadavally2023partial/ |
| A Learning-Based Approach to Static Program Slicing | https://ml4code.github.io/publications/yadavally2024learning/ |
| Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learning | https://ml4code.github.io/publications/yadavally2024predictive/ |
| Fine-Tuning Large Language Models for Answering Programming Questions with Code Snippets | https://ml4code.github.io/publications/lomshakov2023fine/ |
| Testing Neural Program Analyzers | https://ml4code.github.io/publications/rabin2019testing/ |
| Mercem: Method Name Recommendation Based on Call Graph Embedding | https://ml4code.github.io/publications/yonai2019mercem/ |
| On the Impact of Refactoring Operations on Code Naturalness | https://ml4code.github.io/publications/lin2019impact/ |
| Recommendation of Move Method Refactoring Using Path-Based Representation of Code | https://ml4code.github.io/publications/kurbatova2020recommendation/ |
| Understanding Neural Code Intelligence Through Program Simplification | https://ml4code.github.io/publications/rabin2021understanding/ |
| On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations | https://ml4code.github.io/publications/rabin2021generalizability/ |
| Mining Idioms in the Wild | https://ml4code.github.io/publications/sivaraman2021mining/ |
| Syntax-Guided Program Reduction for Understanding Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022understanding/ |
| Memorization and Generalization in Neural Code Intelligence Models | https://ml4code.github.io/publications/rabin2022memorization/ |
| Syntax Errors Just Aren’t Natural: Improving Error Reporting with Language Models | https://ml4code.github.io/publications/campbell2014syntax/ |
| Learning Program Embeddings to Propagate Feedback on Student Code | https://ml4code.github.io/publications/piech2015learning/ |
| OverCode: visualizing variation in student solutions to programming problems at scale | https://ml4code.github.io/publications/glassman2015overcode/ |
| Automated Correction for Syntax Errors in Programming Assignments using Recurrent Neural Networks | https://ml4code.github.io/publications/bhatia2016automated/ |
| sk_p: a neural program corrector for MOOCs | https://ml4code.github.io/publications/pu2016skp/ |
| Semantic Code Repair using Neuro-Symbolic Transformation Networks | https://ml4code.github.io/publications/devlin2017semantic/ |
| DeepFix: Fixing Common C Language Errors by Deep Learning | https://ml4code.github.io/publications/gupta2017deepfix/ |
| Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities | https://ml4code.github.io/publications/white2017sorting/ |
| An Empirical Study on Learning Bug-Fixing Patches in the Wild via Neural Machine Translation | https://ml4code.github.io/publications/tufano2018empirical/ |
| Deep Reinforcement Learning for Programming Language Correction | https://ml4code.github.io/publications/gupta2018deep/ |
| Learning How to Mutate Source Code from Bug-Fixes | https://ml4code.github.io/publications/tufano2018learning/ |
| CODIT: Code Editing with Tree-Based Neural Machine Translation | https://ml4code.github.io/publications/chakraborty2018tree2tree/ |
| Learning to Generate Corrective Patches using Neural Machine Translation | https://ml4code.github.io/publications/hata2018learning/ |
| Learning to Repair Software Vulnerabilities with Generative Adversarial Networks | https://ml4code.github.io/publications/harer2018learning/ |
| Neuro-symbolic program corrector for introductory programming assignments | https://ml4code.github.io/publications/bhatia2018neurosymbolic/ |
| Syntax and Sensibility: Using language models to detect and correct syntax errors | https://ml4code.github.io/publications/santos2018syntax/ |
| SampleFix: Learning to Correct Programs by Sampling Diverse Fixes | https://ml4code.github.io/publications/hajipour2019samplefix/ |
| On Learning Meaningful Code Changes via Neural Machine Translation | https://ml4code.github.io/publications/tufano2019learning/ |
| SequenceR: Sequence-to-Sequence Learning for End-to-End Program Repair | https://ml4code.github.io/publications/chen2019sequencer/ |
| Learning to Fix Build Errors with Graph2Diff Neural Networks | https://ml4code.github.io/publications/tarlow2019learning/ |
| DeepDelta: Learning to Repair Compilation Errors | https://ml4code.github.io/publications/mesbah2019deepdelta/ |
| Neural Program Repair by Jointly Learning to Localize and Repair | https://ml4code.github.io/publications/vasic2019neural/ |
| Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair | https://ml4code.github.io/publications/tian2020evaluating/ |
| DLFix: Context-based Code Transformation Learning for Automated Program Repair | https://ml4code.github.io/publications/li2020dlfix/ |
| Hoppity: Learning Bug Detection and Repair | https://ml4code.github.io/publications/dinella2020hoppity/ |
| Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | https://ml4code.github.io/publications/yasunaga2020graph/ |
| Self-Supervised Bug Detection and Repair | https://ml4code.github.io/publications/allamanis2021self/ |
| Learning to Find Naming Issues with Big Code and Small Supervision | https://ml4code.github.io/publications/he2021learning/ |
| Semantic Bug Seeding: A Learning-Based Approach for Creating Realistic Bugs | https://ml4code.github.io/publications/patra2021semantic/ |
| Neural Program Repair with Execution-based Backpropagation | https://ml4code.github.io/publications/ye2021neural/ |
| Fix-Filter-Fix: Intuitively Connect Any Models for Effective Bug Fixing | https://ml4code.github.io/publications/hong2021fix/ |
| DeepMerge: Learning to Merge Programs | https://ml4code.github.io/publications/dinella2021deepmerge/ |
| Learning to Extend Program Graphs to Work-in-Progress Code | https://ml4code.github.io/publications/li2021learning/ |
| DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons | https://ml4code.github.io/publications/drain2021deepdebug/ |
| Generating Bug-Fixes Using Pretrained Transformers | https://ml4code.github.io/publications/drain2021generating/ |
| TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer | https://ml4code.github.io/publications/berabi2021tfix/ |
| PLUR: A Unifying, Graph-Based View of Program Learning, Understanding, and Repair | https://ml4code.github.io/publications/chen2021plur/ |
| SelfAPR: Self-supervised Program Repair with Test Execution Diagnostics | https://ml4code.github.io/publications/ye2022selfapr/ |
| Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixes | https://ml4code.github.io/publications/richter2022can/ |
| Using Developer Discussions to Guide Fixing Bugs in Software | https://ml4code.github.io/publications/panthaplackel2022using/ |
| Demystifying GPT Self-Repair for Code Generation | https://ml4code.github.io/publications/olausson2023demystifying/ |
| TraceFixer: Execution Trace-Driven Program Repair | https://ml4code.github.io/publications/bouzenia2023tracefixer/ |
| Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models | https://ml4code.github.io/publications/saberi2023model/ |
| SkipAnalyzer: A Tool for Static Code Analysis with Large Language Models | https://ml4code.github.io/publications/mohajer2023skipanalyzer/ |
| RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair | https://ml4code.github.io/publications/silva2023repairllama/ |
| DebugBench: Evaluating Debugging Capability of Large Language Models | https://ml4code.github.io/publications/tian2024debugbench/ |
| T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemble | https://ml4code.github.io/publications/gharibi2024t5apr/ |
| RepairAgent: An Autonomous, LLM-Based Agent for Program Repair | https://ml4code.github.io/publications/bouzenia2024repairagent/ |
| DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models | https://ml4code.github.io/publications/berabi2024deepcode/ |
| Building Program Vector Representations for Deep Learning | https://ml4code.github.io/publications/mou2014building/ |
| Learning to Execute | https://ml4code.github.io/publications/zaremba2014learning/ |
| Exploring the Use of Deep Learning for Feature Location | https://ml4code.github.io/publications/corley2015exploring/ |
| Learning Program Embeddings to Propagate Feedback on Student Code | https://ml4code.github.io/publications/piech2015learning/ |
| Toward Deep Learning Software Repositories | https://ml4code.github.io/publications/white2015toward/ |
| Graph-based Statistical Language Model for Code | https://ml4code.github.io/publications/nguyen2015graph/ |
| Learning to Generate Pseudo-code from Source Code using Statistical Machine Translation | https://ml4code.github.io/publications/oda2015learning/ |
| Learning API Usages from Bytecode: A Statistical Approach | https://ml4code.github.io/publications/nguyen2016learning/ |
| Convolutional Neural Networks over Tree Structures for Programming Language Processing | https://ml4code.github.io/publications/mou2016convolutional/ |
| Bugram: bug detection with n-gram language models | https://ml4code.github.io/publications/wang2016bugram/ |
| Automatically Learning Semantic Features for Defect Prediction | https://ml4code.github.io/publications/wang2016automatically/ |
| Automatically generating features for learning program analysis heuristics | https://ml4code.github.io/publications/chae2016automatically/ |
| Semantically enhanced software traceability using deep learning techniques | https://ml4code.github.io/publications/guo2017semantically/ |
| SmartPaste: Learning to Adapt Source Code | https://ml4code.github.io/publications/allamanis2017smartpaste/ |
| Neural Attribute Machines for Program Generation | https://ml4code.github.io/publications/amodio2017neural/ |
| Exploring API Embedding for API Usages and Applications | https://ml4code.github.io/publications/nguyen2017exploring/ |
| Hierarchical Learning of Cross-Language Mappings through Distributed Vector Representations for Code | https://ml4code.github.io/publications/bui2018hierarchical/ |
| Bilateral Dependency Neural Networks for Cross-Language Algorithm Classification | https://ml4code.github.io/publications/bui2018bilateral/ |
| Path-Based Function Embedding and its Application to Specification Mining | https://ml4code.github.io/publications/defreez2018path/ |
| Cross-Language Learning for Program Classification using Bilateral Tree-Based Convolutional Neural Networks | https://ml4code.github.io/publications/bui2018cross/ |
| Open Vocabulary Learning on Source Code with a Graph-Structured Cache | https://ml4code.github.io/publications/cvitkovic2018open/ |
| Learning to Represent Programs with Graphs | https://ml4code.github.io/publications/allamanis2018learning/ |
| Deep Learning Similarities from Different Representations of Source Code | https://ml4code.github.io/publications/tufano2018deep/ |
| Neural Code Comprehension: A Learnable Representation of Code Semantics | https://ml4code.github.io/publications/bennun2018neural/ |
| Intelligent code reviews using deep learning | https://ml4code.github.io/publications/gupta2018intelligent/ |
| A General Path-Based Representation for Predicting Program Properties | https://ml4code.github.io/publications/alon2018general/ |
| Deep Learning Type Inference | https://ml4code.github.io/publications/hellendoorn2018deep/ |
| code2vec: Learning Distributed Representations of Code | https://ml4code.github.io/publications/alon2019code2vec/ |
| On the Feasibility of Transfer-learning Code Smells using Deep Learning | https://ml4code.github.io/publications/sharma2019feasibility/ |
| Mercem: Method Name Recommendation Based on Call Graph Embedding | https://ml4code.github.io/publications/yonai2019mercem/ |
| Learning Execution through Neural Code Fusion | https://ml4code.github.io/publications/shi2019learning/ |
| Improving Bug Detection via Context-Based Code Representation Learning and Attention-Based Neural Networks | https://ml4code.github.io/publications/li2019improving/ |
| Capturing source code semantics via tree-based convolution over API-enhanced AST | https://ml4code.github.io/publications/chen2019capturing/ |
| Learning Uniform Semantic Features for Natural Language and Programming Language Globally, Locally and Sequentially | https://ml4code.github.io/publications/zhang2019learning/ |
| A Literature Study of Embeddings on Source Code | https://ml4code.github.io/publications/chen2019literature/ |
| code2seq: Generating Sequences from Structured Representations of Code | https://ml4code.github.io/publications/alon2018code2seq/ |
| SAR: Learning Cross-Language API Mappings with Little Knowledge | https://ml4code.github.io/publications/bui2019learning/ |
| Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding | https://ml4code.github.io/publications/chen2019mining/ |
| PathMiner: A Library for Mining of Path-Based Representations of Code | https://ml4code.github.io/publications/kovalenko2019pathminer/ |
| Import2vec - Learning Embeddings for Software Libraries | https://ml4code.github.io/publications/theeten2019import2vec/ |
| Semantic Source Code Models Using Identifier Embeddings | https://ml4code.github.io/publications/efstathiou2019semantic/ |
| Asm2Vec: Boosting Static Representation Robustness for Binary Clone Search against Code Obfuscation and Compiler Optimization | https://ml4code.github.io/publications/ding2019asm2vec/ |
| Learning Scalable and Precise Representation of Program Semantics | https://ml4code.github.io/publications/wang2019learning/ |
| Program Classification Using Gated Graph Attention Neural Network for Online Programming Service | https://ml4code.github.io/publications/lu2019program/ |
| Neural Attribution for Semantic Bug-Localization in Student Programs | https://ml4code.github.io/publications/gupta2019neural/ |
| TreeCaps: Tree-Structured Capsule Networks for Program Source Code Processing | https://ml4code.github.io/publications/jayasundara2019treecaps/ |
| A Novel Neural Source Code Representation based on Abstract Syntax Tree | https://ml4code.github.io/publications/zhang2019novel/ |
| Modular Tree Network for Source Code Representation Learning | https://ml4code.github.io/publications/wang2020modular/ |
| Searching a Database of Source Codes Using Contextualized Code Search | https://ml4code.github.io/publications/mukherjee2020searching/ |
| Towards Demystifying Dimensions of Source Code Embeddings | https://ml4code.github.io/publications/rabin2020demystifying/ |
| Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks | https://ml4code.github.io/publications/bieber2020learning/ |
| Towards Learning Representations of Binary Executable Files for Security Tasks | https://ml4code.github.io/publications/arakelyan2020towards/ |
| ComPy-Learn: A toolbox for exploring machine learning representations for compilers | https://ml4code.github.io/publications/brauckmann2020compy/ |
| Compiler-based graph representations for deep learning models of code | https://ml4code.github.io/publications/brauckmann2020compiler/ |
| Contrastive Code Representation Learning | https://ml4code.github.io/publications/jain2020contrastive/ |
| Unsupervised Learning of General-Purpose Embeddings for Code Changes | https://ml4code.github.io/publications/pravilov2021unsupervised/ |
| Contrastive Learning for Source Code with Structural and Functional Properties | https://ml4code.github.io/publications/ding2021contrastive/ |
| Disentangled Code Representation Learning for Multiple Programming Languages | https://ml4code.github.io/publications/zhang2021disentangled/ |
| IdBench: Evaluating Semantic Representations of Identifier Names in Source Code | https://ml4code.github.io/publications/waunakh2019idbench/ |
| Multimodal Representation for Neural Code Search | https://ml4code.github.io/publications/jian2021multimodal/ |
| MulCode: A Multi-task Learning Approach for Source Code Understanding | https://ml4code.github.io/publications/deze2021mulcode/ |
| Language-Agnostic Representation Learning of Source Code from Structure and Context | https://ml4code.github.io/publications/zugner2021language/ |
| InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees | https://ml4code.github.io/publications/bui2021infercode/ |
| Learning Program Semantics with Code Representations: An Empirical Study | https://ml4code.github.io/publications/siow2022learning/ |
| Bridging Pre-trained Models and Downstream Tasks for Source Code Understanding | https://ml4code.github.io/publications/deze2022bridging/ |
| SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations | https://ml4code.github.io/publications/niu2022spt-code/ |
| LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition | https://ml4code.github.io/publications/sharma2022lamner/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation | https://ml4code.github.io/publications/pashakhanloo2022codetrek/ |
| Topical: Learning Repository Embeddings from Source Code using Attention | https://ml4code.github.io/publications/lherondelle2022topical/ |
| Rethinking Negative Pairs in Code Search | https://ml4code.github.io/publications/li2023rethinking/ |
| Rethinking Negative Pairs in Code Search | https://ml4code.github.io/publications/li2023rethinking/ |
| RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation | https://ml4code.github.io/publications/zhang2023repocoder/ |
| Learning to Reverse DNNs from AI Programs Automatically | https://ml4code.github.io/publications/chen2022learning/ |
| Will they like this? Evaluating Code Contributions With Language Models | https://ml4code.github.io/publications/hellendoorn2015will/ |
| Intelligent code reviews using deep learning | https://ml4code.github.io/publications/gupta2018intelligent/ |
| CORE: Automating Review Recommendation for Code Changes | https://ml4code.github.io/publications/siow2019core/ |
| Deep Learning Approaches to Source Code Analysis for Optimization of Heterogeneous Systems: Recent Results, Challenges and Opportunities | https://ml4code.github.io/publications/barchi2022deep/ |
| CodeReviewer: Pre-Training for Automating Code Review Activities | https://ml4code.github.io/publications/li2022codereviewer/ |
| What is it like to program with artificial intelligence? | https://ml4code.github.io/publications/sarkar2022what/ |
| Aroma: code recommendation via structural code search | https://ml4code.github.io/publications/luan2019aroma/ |
| A Bimodal Modelling of Source Code and Natural Language | https://ml4code.github.io/publications/allamanis2015bimodal/ |
| Deep API Learning | https://ml4code.github.io/publications/gu2016deep/ |
| Deep Code Search | https://ml4code.github.io/publications/gu2018deep/ |
| A Retrieve-and-Edit Framework for Predicting Structured Outputs | https://ml4code.github.io/publications/hashimoto2018retrieve/ |
| CodeSearchNet Challenge: Evaluating the State of Semantic Code Search | https://ml4code.github.io/publications/husain2019codesearchnet/ |
| Neural Code Search Evaluation Dataset | https://ml4code.github.io/publications/li2019neural/ |
| Multi-Modal Attention Network Learning for Semantic Source Code Retrieval | https://ml4code.github.io/publications/wan2019multimodal/ |
| CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning | https://ml4code.github.io/publications/yao2019coacor/ |
| When Deep Learning Met Code Search | https://ml4code.github.io/publications/cambronero2019deep/ |
| Neural query expansion for code search | https://ml4code.github.io/publications/liu2019neural/ |
| Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent | https://ml4code.github.io/publications/heyman2020neural/ |
| CoNCRA: A Convolutional Neural Network Code Retrieval Approach | https://ml4code.github.io/publications/derezendemartins2020concra/ |
| Searching a Database of Source Codes Using Contextualized Code Search | https://ml4code.github.io/publications/mukherjee2020searching/ |
| Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning | https://ml4code.github.io/publications/ye2020leveraging/ |
| TranS^3: A Transformer-based Framework for Unifying Code Summarization and Code Search | https://ml4code.github.io/publications/wang2020trans/ |
| PSCS: A Path-based Neural Model for Semantic Code Search | https://ml4code.github.io/publications/sun2020pscs/ |
| Improving Code Search with Co-Attentive Representation Learning | https://ml4code.github.io/publications/shuai2020improving/ |
| A Multi-Perspective Architecture for Semantic Code Search | https://ml4code.github.io/publications/haldar2020multiperspective/ |
| Adaptive Deep Code Search | https://ml4code.github.io/publications/ling2020adaptive/ |
| Are the Code Snippets What We Are Searching for? A Benchmark and an Empirical Study on Code Search with Natural-Language Queries | https://ml4code.github.io/publications/yan2020are/ |
| NaturalCC: A Toolkit to Naturalize the Source Code Corpus | https://ml4code.github.io/publications/wan2020naturalcc/ |
| Deep Graph Matching and Searching for Semantic Code Retrieval | https://ml4code.github.io/publications/ling2020deep/ |
| Learning Code-Query Interaction for Enhancing Code Searches | https://ml4code.github.io/publications/li2020learning/ |
| OCoR: An Overlapping-Aware Code Retriever | https://ml4code.github.io/publications/zhu2020ocor/ |
| CoSQA: 20,000+ Web Queries for Code Search and Question Answering | https://ml4code.github.io/publications/huang2021cosqa/ |
| DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning | https://ml4code.github.io/publications/ellis2021dreamcoder/ |
| Multimodal Representation for Neural Code Search | https://ml4code.github.io/publications/jian2021multimodal/ |
| Bag-of-Words Baselines for Semantic Code Search | https://ml4code.github.io/publications/zhang2021bag/ |
| Distilling Transformers for Neural Cross-Domain Search | https://ml4code.github.io/publications/clement2021distilling/ |
| Leveraging Language to Learn Program Abstractions and Search Heuristics | https://ml4code.github.io/publications/wong2021leveraging/ |
| Self-Supervised Contrastive Learning for Code Retrieval and Summarization via Semantic-Preserving Transformations | https://ml4code.github.io/publications/bui2021efficient/ |
| Exploring Representation-Level Augmentation for Code Search | https://ml4code.github.io/publications/li2022exploring/ |
| Senatus - A Fast and Accurate Code-to-Code Recommendation Engine | https://ml4code.github.io/publications/silavong2022senatus/ |
| DocCoder: Generating Code by Retrieving and Reading Docs | https://ml4code.github.io/publications/zhou2022docoder/ |
| CodeDSI: Differentiable Code Search | https://ml4code.github.io/publications/nadeem2022codedsi/ |
| Rethinking Negative Pairs in Code Search | https://ml4code.github.io/publications/li2023rethinking/ |
| Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search | https://ml4code.github.io/publications/li2024rewriting/ |
| A Learning-Based Approach to Static Program Slicing | https://ml4code.github.io/publications/yadavally2024learning/ |
| Learning a Classifier for False Positive Error Reports Emitted by Static Code Analysis Tools | https://ml4code.github.io/publications/koc2017learning/ |
| Code Mapping in Heterogeneous Platforms Using Deep Learning and LLVM-IR | https://ml4code.github.io/publications/barchi2019code/ |
| Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks | https://ml4code.github.io/publications/zhou2019devign/ |
| Predicting Vulnerability in Large Codebases With Deep Code Representation | https://ml4code.github.io/publications/ashwath2020predicting/ |
| Exploration of Convolutional Neural Network models for source code classification | https://ml4code.github.io/publications/barchi2021exploration/ |
| Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device Mapping | https://ml4code.github.io/publications/parisi2022making/ |
| Learning to Answer Semantic Queries over Code | https://ml4code.github.io/publications/sahu2022learning/ |
| Learning to Reduce False Positives in Analytic Bug Detectors | https://ml4code.github.io/publications/kharkar2022learning/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| The Hitchhiker's Guide to Program Analysis: A Journey with Large Language Models | https://ml4code.github.io/publications/li2023hitchhiker/ |
| A Static Evaluation of Code Completion by Large Language Models | https://ml4code.github.io/publications/ding2023static/ |
| (Partial) Program Dependence Learning | https://ml4code.github.io/publications/yadavally2023partial/ |
| Beware of the Unexpected: Bimodal Taint Analysis | https://ml4code.github.io/publications/chow2023beware/ |
| Learning Natural Coding Conventions | https://ml4code.github.io/publications/allamanis2014learning/ |
| STYLE-ANALYZER: fixing code style inconsistencies with interpretable unsupervised algorithms | https://ml4code.github.io/publications/markovtsev2019style/ |
| Natural Language Models for Predicting Programming Comments | https://ml4code.github.io/publications/movshovitz2013natural/ |
| A Convolutional Attention Network for Extreme Summarization of Source Code | https://ml4code.github.io/publications/allamanis2016convolutional/ |
| Summarizing Source Code using a Neural Attention Model | https://ml4code.github.io/publications/iyer2016summarizing/ |
| Autofolding for Source Code Summarization | https://ml4code.github.io/publications/fowkes2017autofolding/ |
| Abridging Source Code | https://ml4code.github.io/publications/yuan2017abridging/ |
| CodeSum: Translate Program Language to Natural Language | https://ml4code.github.io/publications/hu2017codesum/ |
| A parallel corpus of Python functions and documentation strings for automated code documentation and code generation | https://ml4code.github.io/publications/barone2017parallel/ |
| A Neural Architecture for Generating Natural Language Descriptions from Source Code Changes | https://ml4code.github.io/publications/loyola2017neural/ |
| Content Aware Source Code Change Description Generation | https://ml4code.github.io/publications/loyola2018content/ |
| Improving Automatic Source Code Summarization via Deep Reinforcement Learning | https://ml4code.github.io/publications/wan2018improving/ |
| Neural-Machine-Translation-Based Commit Message Generation: How Far Are We? | https://ml4code.github.io/publications/liu2018neural/ |
| code2vec: Learning Distributed Representations of Code | https://ml4code.github.io/publications/alon2019code2vec/ |
| A Neural Model for Method Name Generation from Functional Description | https://ml4code.github.io/publications/gao2019neural/ |
| code2seq: Generating Sequences from Structured Representations of Code | https://ml4code.github.io/publications/alon2018code2seq/ |
| Commit Message Generation for Source Code Changes | https://ml4code.github.io/publications/xu2019commit/ |
| Code Generation as a Dual Task of Code Summarization | https://ml4code.github.io/publications/wei2019code/ |
| Structured Neural Summarization | https://ml4code.github.io/publications/fernandes2019structured/ |
| A Neural Model for Generating Natural Language Summaries of Program Subroutines | https://ml4code.github.io/publications/leclair2019neural/ |
| Recommendations for Datasets for Source Code Summarization | https://ml4code.github.io/publications/leclair2019recommendations/ |
| Automatic Source Code Summarization with Extended Tree-LSTM | https://ml4code.github.io/publications/shido2019automatic/ |
| Improved Automatic Summarization of Subroutines via Attention to File Context | https://ml4code.github.io/publications/haque2020improved/ |
| Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning | https://ml4code.github.io/publications/ye2020leveraging/ |
| A Transformer-based Approach for Source Code Summarization | https://ml4code.github.io/publications/ahmad2020transformer/ |
| PyMT5: multi-mode translation of natural language and Python code with transformers | https://ml4code.github.io/publications/clement2020pymt5/ |
| NaturalCC: A Toolkit to Naturalize the Source Code Corpus | https://ml4code.github.io/publications/wan2020naturalcc/ |
| Improved Code Summarization via a Graph Neural Network | https://ml4code.github.io/publications/leclair2020improved/ |
| CoCoGUM: Contextual Code Summarization with Multi-Relational GNN on UMLs | https://ml4code.github.io/publications/wang2020cocogum/ |
| Learning to Represent Programs with Heterogeneous Graphs | https://ml4code.github.io/publications/wang2020learning2/ |
| On the Generalizability of Neural Program Models with respect to Semantic-Preserving Program Transformations | https://ml4code.github.io/publications/rabin2021generalizability/ |
| Retrieval Augmented Code Generation and Summarization | https://ml4code.github.io/publications/parvez2021retrieval/ |
| Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors | https://ml4code.github.io/publications/mahmud2021code/ |
| Learning to Describe Solutions for Bug Reports Based on Developer Discussions | https://ml4code.github.io/publications/panthaplackel2021learning/ |
| Assemble Foundation Models for Automatic Code Summarization | https://ml4code.github.io/publications/jian2022assemble/ |
| InCoder: A Generative Model for Code Infilling and Synthesis | https://ml4code.github.io/publications/fried2022incoder/ |
| LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition | https://ml4code.github.io/publications/sharma2022lamner/ |
| Learning code summarization from a small and local dataset | https://ml4code.github.io/publications/ahmed2022learning/ |
| Improving Few-Shot Prompts with Relevant Static Analysis Products | https://ml4code.github.io/publications/ahmed2033improving/ |
| Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models | https://ml4code.github.io/publications/saberi2023model/ |
| A Survey on Deep Learning for Software Engineering | https://ml4code.github.io/publications/yang2020survey/ |
| Neural Software Analysis | https://ml4code.github.io/publications/pradel2020neural/ |
| Deep Learning & Software Engineering: State of Research and Future Directions | https://ml4code.github.io/publications/devanbu2020deep/ |
| Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors | https://ml4code.github.io/publications/mahmud2021code/ |
| A Systematic Literature Review on the Use of Deep Learning in Software Engineering Research | https://ml4code.github.io/publications/watson2021systematic/ |
| Deep Learning based Vulnerability Detection: Are We There Yet? | https://ml4code.github.io/publications/chakraborty2020deep/ |
| A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Tasks | https://ml4code.github.io/publications/casey2024survey/ |
| NLyze: Interactive Programming by Natural Language for SpreadSheet Data Analysis and Manipulation | https://ml4code.github.io/publications/gulwani2014nlyze/ |
| Synthesizing Java expressions from free-form queries | https://ml4code.github.io/publications/gvero2015synthesizing/ |
| SPoC: Search-based Pseudocode to Code | https://ml4code.github.io/publications/kulal2019spoc/ |
| AutoPandas: neural-backed generators for program synthesis | https://ml4code.github.io/publications/bavishi2019autopandas/ |
| Semantic Scaffolds for Pseudocode-to-Code Generation | https://ml4code.github.io/publications/zhong2020semantic/ |
| Unit Test Case Generation with Transformers | https://ml4code.github.io/publications/tufano2020unit/ |
| Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers | https://ml4code.github.io/publications/tufano2020generating/ |
| IntelliCode Compose: Code Generation Using Transformer | https://ml4code.github.io/publications/svyatkovskiy2020intellicode/ |
| Evaluating Large Language Models Trained on Code | https://ml4code.github.io/publications/chen2021evaluating/ |
| DreamCoder: bootstrapping inductive program synthesis with wake-sleep library learning | https://ml4code.github.io/publications/ellis2021dreamcoder/ |
| Program Synthesis with Large Language Models | https://ml4code.github.io/publications/nye2021program/ |
| A large-scale benchmark for few-shot program induction and synthesis | https://ml4code.github.io/publications/alet2021largescale/ |
| Leveraging Language to Learn Program Abstractions and Search Heuristics | https://ml4code.github.io/publications/wong2021leveraging/ |
| Neural Program Generation Modulo Static Analysis | https://ml4code.github.io/publications/mukherjee2021neural/ |
| A Conversational Paradigm for Program Synthesis | https://ml4code.github.io/publications/nijkamp2022conversational/ |
| I Speak, You Verify: Toward Trustworthy Neural Program Synthesis | https://ml4code.github.io/publications/key2022speak/ |
| CodeT: Code Generation with Generated Tests | https://ml4code.github.io/publications/chen2022codet/ |
| Grounded Copilot: How Programmers Interact with Code-Generating Models | https://ml4code.github.io/publications/barke2022grounded/ |
| Unit Test Case Generation with Transformers | https://ml4code.github.io/publications/tufano2020unit/ |
| Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers | https://ml4code.github.io/publications/tufano2020generating/ |
| TOGA: A Neural Method for Test Oracle Generation | https://ml4code.github.io/publications/dinella2022toga/ |
| Test-based and metric-based evaluation of code generation models for practical question answering | https://ml4code.github.io/publications/kovalchuk2023test/ |
| PSIMiner: A Tool for Mining Rich Abstract Syntax Trees from Code | https://ml4code.github.io/publications/spirin2021psiminer/ |
| Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context | https://ml4code.github.io/publications/agrawal2023monitor/ |
| (Partial) Program Dependence Learning | https://ml4code.github.io/publications/yadavally2023partial/ |
| A Learning-Based Approach to Static Program Slicing | https://ml4code.github.io/publications/yadavally2024learning/ |
| Predictive Program Slicing via Execution Knowledge-Guided Dynamic Dependence Learning | https://ml4code.github.io/publications/yadavally2024predictive/ |
| Topic modeling of public repositories at scale using names in source code | https://ml4code.github.io/publications/markovtsev2017topic/ |
| Topical: Learning Repository Embeddings from Source Code using Attention | https://ml4code.github.io/publications/lherondelle2022topical/ |
| Semantically enhanced software traceability using deep learning techniques | https://ml4code.github.io/publications/guo2017semantically/ |
| Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair | https://ml4code.github.io/publications/tian2020evaluating/ |
| Global Relational Models of Source Code | https://ml4code.github.io/publications/hellendoorn2020global/ |
| Empirical Study of Transformers for Source Code | https://ml4code.github.io/publications/chirkova2020empirical/ |
| Self-Supervised Bug Detection and Repair | https://ml4code.github.io/publications/allamanis2021self/ |
| Retrieval Augmented Code Generation and Summarization | https://ml4code.github.io/publications/parvez2021retrieval/ |
| Show Your Work: Scratchpads for Intermediate Computation with Language Models | https://ml4code.github.io/publications/nye2021show/ |
| ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback | https://ml4code.github.io/publications/wu2021prototransformer/ |
| CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model | https://ml4code.github.io/publications/jung2021commitbert/ |
| CoTexT: Multi-task Learning with Code-Text Transformer | https://ml4code.github.io/publications/phan2021cotext/ |
| Code to Comment Translation: A Comparative Study on Model Effectiveness & Errors | https://ml4code.github.io/publications/mahmud2021code/ |
| ConTest: A Unit Test Completion Benchmark featuring Context | https://ml4code.github.io/publications/villmow2021contest/ |
| Contrastive Learning for Source Code with Structural and Functional Properties | https://ml4code.github.io/publications/ding2021contrastive/ |
| Jointly Learning to Repair Code and Generate Commit Message | https://ml4code.github.io/publications/bai2021jointly/ |
| Co-Training for Commit Classification | https://ml4code.github.io/publications/lee2021cotraining/ |
| Toward Less Hidden Cost of Code Completion with Acceptance and Ranking Models | https://ml4code.github.io/publications/li2021toward/ |
| Learning Type Annotation: Is Big Data Enough? | https://ml4code.github.io/publications/jesse2021learning/ |
| TreeBERT: A Tree-Based Pre-Trained Model for Programming Language | https://ml4code.github.io/publications/jiang2021treebert/ |
| Program Synthesis with Large Language Models | https://ml4code.github.io/publications/nye2021program/ |
| An Empirical Cybersecurity Evaluation of GitHub Copilot's Code Contributions | https://ml4code.github.io/publications/pearce2021empirical/ |
| Reading StackOverflow Encourages Cheating: Adding Question Text Improves Extractive Code Generation | https://ml4code.github.io/publications/orlanski2021reading/ |
| How could Neural Networks understand Programs? | https://ml4code.github.io/publications/peng2021how/ |
| Learning to Extend Program Graphs to Work-in-Progress Code | https://ml4code.github.io/publications/li2021learning/ |
| DeepDebug: Fixing Python Bugs Using Stack Traces, Backtranslation, and Code Skeletons | https://ml4code.github.io/publications/drain2021deepdebug/ |
| Generating Bug-Fixes Using Pretrained Transformers | https://ml4code.github.io/publications/drain2021generating/ |
| CodeT5: Identifier-aware Unified Pre-trained Encoder-Decoder Models for Code Understanding and Generation | https://ml4code.github.io/publications/wang2021codet5/ |
| Improving Code Autocompletion with Transfer Learning | https://ml4code.github.io/publications/zhou2021improving/ |
| Long-Range Modeling of Source Code Files with eWASH: Extended Window Access by Syntax Hierarchy | https://ml4code.github.io/publications/clement2021long/ |
| Language-Agnostic Representation Learning of Source Code from Structure and Context | https://ml4code.github.io/publications/zugner2021language/ |
| Distilling Transformers for Neural Cross-Domain Search | https://ml4code.github.io/publications/clement2021distilling/ |
| Time-Efficient Code Completion Model for the R Programming Language | https://ml4code.github.io/publications/popov2021time/ |
| What do pre-trained code models know about code? | https://ml4code.github.io/publications/karmakar2021what/ |
| CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing | https://ml4code.github.io/publications/elnaggar2021codetrans/ |
| DIRECT: A Transformer-based Model for Decompiled Identifier Renaming | https://ml4code.github.io/publications/nitin2021direct/ |
| On Multi-Modal Learning of Editing Source Code | https://ml4code.github.io/publications/chakraborty2021multimodal/ |
| CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation | https://ml4code.github.io/publications/lu2021codexglue/ |
| Unified Pre-training for Program Understanding and Generation | https://ml4code.github.io/publications/ahmad2021unified/ |
| Code Translation with Compiler Representations | https://ml4code.github.io/publications/szafraniec2022code/ |
| Learning to Model Editing Processes | https://ml4code.github.io/publications/reid2022learning/ |
| SantaCoder: don’t reach for the stars! | https://ml4code.github.io/publications/allal2022santacoder/ |
| Learning To Predict User-Defined Types | https://ml4code.github.io/publications/jesse2022learning/ |
| Efficient Training of Language Models to Fill in the Middle | https://ml4code.github.io/publications/bavarian2022efficient/ |
| CoditT5: Pretraining for Source Code and Natural Language Editing | https://ml4code.github.io/publications/zhang2022coditt5/ |
| Piloting Copilot and Codex: Hot Temperature, Cold Prompts, or Black Magic? | https://ml4code.github.io/publications/doderlein2022piloting/ |
| Exploring Representation-Level Augmentation for Code Search | https://ml4code.github.io/publications/li2022exploring/ |
| A Systematic Evaluation of Large Language Models of Code | https://ml4code.github.io/publications/xu2022systematic/ |
| A Conversational Paradigm for Program Synthesis | https://ml4code.github.io/publications/nijkamp2022conversational/ |
| Synchromesh: Reliable code generation from pre-trained language models | https://ml4code.github.io/publications/poesia2022synchromesh/ |
| An Extensive Study on Pre-trained Models for Program Understanding and Generation | https://ml4code.github.io/publications/zeng2022extensive/ |
| TOGA: A Neural Method for Test Oracle Generation | https://ml4code.github.io/publications/dinella2022toga/ |
| Learning to Complete Code with Sketches | https://ml4code.github.io/publications/guo2022learning/ |
| UniXcoder: Unified Cross-Modal Pre-training for Code Representation | https://ml4code.github.io/publications/guo2022unixcoder/ |
| Repository-Level Prompt Generation for Large Language Models of Code | https://ml4code.github.io/publications/shrivastava2020repository/ |
| InCoder: A Generative Model for Code Infilling and Synthesis | https://ml4code.github.io/publications/fried2022incoder/ |
| Can we learn from developer mistakes? Learning to localize and repair real bugs from real bug fixes | https://ml4code.github.io/publications/richter2022can/ |
| CodeT: Code Generation with Generated Tests | https://ml4code.github.io/publications/chen2022codet/ |
| SPT-Code: Sequence-to-Sequence Pre-Training for Learning Source Code Representations | https://ml4code.github.io/publications/niu2022spt-code/ |
| DocCoder: Generating Code by Retrieving and Reading Docs | https://ml4code.github.io/publications/zhou2022docoder/ |
| ReACC: A Retrieval-Augmented Code Completion Framework | https://ml4code.github.io/publications/lu2022reacc/ |
| Probing Semantic Grounding in Language Models of Code with Representational Similarity Analysis | https://ml4code.github.io/publications/naik2022probing/ |
| Using Developer Discussions to Guide Fixing Bugs in Software | https://ml4code.github.io/publications/panthaplackel2022using/ |
| CV4Code: Sourcecode Understanding via Visual Code Representations | https://ml4code.github.io/publications/shi2022cv4code/ |
| An Exploratory Study on Code Attention in BERT | https://ml4code.github.io/publications/sharma2022exploratory/ |
| Exploring Dimensions of Generalizability and Few-shot Transfer for Text-to-SQL Semantic Parsing | https://ml4code.github.io/publications/patil2022exploring/ |
| Code Generation Tools (Almost) for Free? A Study of Few-Shot, Pre-Trained Language Models on Code | https://ml4code.github.io/publications/bareiss2022code/ |
| Learning code summarization from a small and local dataset | https://ml4code.github.io/publications/ahmed2022learning/ |
| Learning to Answer Semantic Queries over Code | https://ml4code.github.io/publications/sahu2022learning/ |
| DeepPERF: A Deep Learning-Based Approach For Improving Software Performance | https://ml4code.github.io/publications/garg2022deepperf/ |
| Using Deep Learning to Generate Complete Log Statements | https://ml4code.github.io/publications/mastropaolo2022using/ |
| Learning to Reduce False Positives in Analytic Bug Detectors | https://ml4code.github.io/publications/kharkar2022learning/ |
| Exploring and Evaluating Personalized Models for Code Generation | https://ml4code.github.io/publications/zlotchevski2022exploring/ |
| What Do They Capture? -- A Structural Analysis of Pre-Trained Language Models for Source Code | https://ml4code.github.io/publications/wan2022what/ |
| Improving Few-Shot Prompts with Relevant Static Analysis Products | https://ml4code.github.io/publications/ahmed2033improving/ |
| CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code | https://ml4code.github.io/publications/zhou2022codebertscore/ |
| StarCoder: may the source be with you! | https://ml4code.github.io/publications/li2023starcoder/ |
| Large Language Models and Simple, Stupid Bugs | https://ml4code.github.io/publications/jesse2023large/ |
| TypeT5: Seq2seq Type Inference using Static Analysis | https://ml4code.github.io/publications/wei2023typet5/ |
| TraceFixer: Execution Trace-Driven Program Repair | https://ml4code.github.io/publications/bouzenia2023tracefixer/ |
| Model-Agnostic Syntactical Information for Pre-Trained Programming Language Models | https://ml4code.github.io/publications/saberi2023model/ |
| Rethinking Negative Pairs in Code Search | https://ml4code.github.io/publications/li2023rethinking/ |
| CodeGen2: Lessons for Training LLMs on Programming and Natural Languages | https://ml4code.github.io/publications/nijkamp2023codegen2/ |
| RepoCoder: Repository-Level Code Completion Through Iterative Retrieval and Generation | https://ml4code.github.io/publications/zhang2023repocoder/ |
| Code Execution with Pre-trained Language Models | https://ml4code.github.io/publications/liu2023code/ |
| DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection | https://ml4code.github.io/publications/chen2023diversevul/ |
| CodeScore: Evaluating Code Generation by Learning Code Execution | https://ml4code.github.io/publications/dong2023codescore/ |
| CodeT5+: Open Code Large Language Models for Code Understanding and Generation | https://ml4code.github.io/publications/wang2023codet5/ |
| Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation | https://ml4code.github.io/publications/li2023think/ |
| T5APR: Empowering Automated Program Repair across Languages through Checkpoint Ensemble | https://ml4code.github.io/publications/gharibi2024t5apr/ |
| Studying LLM Performance on Closed- and Open-source Data | https://ml4code.github.io/publications/ahmed2024studying/ |
| DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | https://ml4code.github.io/publications/guo2024deepseek/ |
| Automatically Testing Functional Properties of Code Translation Models | https://ml4code.github.io/publications/eniser2023automatically/ |
| LLM4Decompile: Decompiling Binary Code with Large Language Models | https://ml4code.github.io/publications/tan2024llm4decompile/ |
| Predicting Program Properties from “Big Code” | https://ml4code.github.io/publications/raychev2015predicting/ |
| RefiNym: Using Names to Refine Types | https://ml4code.github.io/publications/dash2018refinym/ |
| Deep Learning Type Inference | https://ml4code.github.io/publications/hellendoorn2018deep/ |
| TypeWriter: Neural Type Prediction with Search-based Validation | https://ml4code.github.io/publications/pradel2019typewriter/ |
| NL2Type: Inferring JavaScript Function Types from Natural Language Information | https://ml4code.github.io/publications/malik2019nl2type/ |
| Inferring Javascript types using Graph Neural Networks | https://ml4code.github.io/publications/schrouff2019inferring/ |
| Learning Lenient Parsing & Typing via Indirect Supervision | https://ml4code.github.io/publications/ahmed2019learning/ |
| OptTyper: Probabilistic Type Inference by Optimising Logical and Natural Constraints | https://ml4code.github.io/publications/pandi2020opttyper/ |
| LambdaNet: Probabilistic Type Inference using Graph Neural Networks | https://ml4code.github.io/publications/wei2020lambdanet/ |
| Adversarial Robustness for Code | https://ml4code.github.io/publications/bielik2020adversarial/ |
| Typilus: Neural Type Hints | https://ml4code.github.io/publications/allamanis2020typilus/ |
| Learning Type Annotation: Is Big Data Enough? | https://ml4code.github.io/publications/jesse2021learning/ |
| ManyTypes4Py: A Benchmark Python Dataset for Machine Learning-based Type Inference | https://ml4code.github.io/publications/mir2021manytypes4py/ |
| Type4Py: Deep Similarity Learning-Based Type Inference for Python | https://ml4code.github.io/publications/mir2021type4py/ |
| Learning To Predict User-Defined Types | https://ml4code.github.io/publications/jesse2022learning/ |
| LAMNER: Code Comment Generation Using Character Language Model and Named Entity Recognition | https://ml4code.github.io/publications/sharma2022lamner/ |
| TypeT5: Seq2seq Type Inference using Static Analysis | https://ml4code.github.io/publications/wei2023typet5/ |
| Generative Type Inference for Python | https://ml4code.github.io/publications/peng2023generative/ |
| SmartPaste: Learning to Adapt Source Code | https://ml4code.github.io/publications/allamanis2017smartpaste/ |
| Open Vocabulary Learning on Source Code with a Graph-Structured Cache | https://ml4code.github.io/publications/cvitkovic2018open/ |
| Learning to Represent Programs with Graphs | https://ml4code.github.io/publications/allamanis2018learning/ |
| Neural Program Repair by Jointly Learning to Localize and Repair | https://ml4code.github.io/publications/vasic2019neural/ |
| Global Relational Models of Source Code | https://ml4code.github.io/publications/hellendoorn2020global/ |
| CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation | https://ml4code.github.io/publications/pashakhanloo2022codetrek/ |
| Learning Loop Invariants for Program Verification | https://ml4code.github.io/publications/si2018learning/ |
| ConTest: A Unit Test Completion Benchmark featuring Context | https://ml4code.github.io/publications/villmow2021contest/ |
| DeepVD: Toward Class-Separation Features for Neural Network Vulnerability Detection | https://ml4code.github.io/publications/wang2023deepvd/ |
| DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection | https://ml4code.github.io/publications/chen2023diversevul/ |
| DeepCode AI Fix: Fixing Security Vulnerabilities with Large Language Models | https://ml4code.github.io/publications/berabi2024deepcode/ |
| A Survey of Source Code Representations for Machine Learning-Based Cybersecurity Tasks | https://ml4code.github.io/publications/casey2024survey/ |