Title: Issues with built-in python, java, and go grammars · Issue #137 · structuredllm/syncode · GitHub
Open Graph Title: Issues with built-in python, java, and go grammars · Issue #137 · structuredllm/syncode
X Title: Issues with built-in python, java, and go grammars · Issue #137 · structuredllm/syncode
Description: I am experiencing issues with the built-in grammars for Java, Python, and Go. The Python and Java grammars appear to be ignored, while the Go grammar produces mostly gibberish. Below is a script to reproduce the issue, along with example...
Open Graph Description: I am experiencing issues with the built-in grammars for Java, Python, and Go. The Python and Java grammars appear to be ignored, while the Go grammar produces mostly gibberish. Below is a script to...
X Description: I am experiencing issues with the built-in grammars for Java, Python, and Go. The Python and Java grammars appear to be ignored, while the Go grammar produces mostly gibberish. Below is a script to...
Opengraph URL: https://github.com/structuredllm/syncode/issues/137
X: @github
Domain: patch-diff.githubusercontent.com
{"@context":"https://schema.org","@type":"DiscussionForumPosting","headline":"Issues with built-in python, java, and go grammars","articleBody":"I am experiencing issues with the built-in grammars for Java, Python, and Go. The Python and Java grammars appear to be ignored, while the Go grammar produces mostly gibberish. Below is a script to reproduce the issue, along with example outputs. I installed with `pip install git+https://github.com/uiuc-focal-lab/syncode.git`. syncode version = 0.1 .\r\n\r\n```python\r\nimport torch\r\nfrom syncode import SyncodeLogitsProcessor\r\nfrom syncode import Grammar\r\nfrom transformers import AutoModelForCausalLM, AutoTokenizer\r\n\r\ndevice = 'cuda'\r\n# model_name = \"meta-llama/Llama-3.2-1B-Instruct\"\r\nmodel_name = \"meta-llama/Llama-3.1-8B-Instruct\"\r\ncache_dir = None\r\n\r\nmodel = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, cache_dir=cache_dir).eval().to(device)\r\ntokenizer = AutoTokenizer.from_pretrained(model_name)\r\n\r\n# Initialize SynCode logits processor for the given grammar\r\n\r\n# grammar_str = \"\"\" start: month \" \" day \r\n \r\n# day: /[1-9]/ | /[1-2][0-9]/ | /3[0-1]/\r\n \r\n# month: \"January\" | \"February\" | \"March\" | \"April\" | \"May\" | \"June\" | \"July\" | \"August\" | \"September\" | \"October\" | \"November\" | \"December\"\r\n# \"\"\"\r\ngrammar_str = \"python\"\r\n# grammar_str = \"go\"\r\n# grammar_str = \"java\"\r\n\r\ndate_grammar = Grammar(grammar_str)\r\nsyncode_logits_processor = SyncodeLogitsProcessor(grammar=date_grammar, tokenizer=tokenizer, parse_output_only=True)\r\n\r\nprompt = f\"Write a {grammar_str} function that prints 'hello world' in reverse.\"\r\nmessages = [{\"role\": \"user\", \"content\": prompt}]\r\nprompt = tokenizer.apply_chat_template(\r\n messages, tokenize=False, add_generation_prompt=True\r\n )\r\nprint(\"[PROMPT]\", prompt, \"\\n\")\r\n\r\nsyncode_logits_processor.reset(prompt)\r\n\r\ninputs = tokenizer(prompt, return_tensors='pt').input_ids.to(device)\r\n\r\nattention_mask = torch.ones_like(inputs)\r\noutput = model.generate(\r\n inputs,\r\n attention_mask=attention_mask,\r\n max_length=512, \r\n num_return_sequences=1, \r\n pad_token_id=tokenizer.eos_token_id, \r\n logits_processor=[syncode_logits_processor]\r\n )\r\noutput_str = tokenizer.decode(output[0][len(inputs[0]):], skip_special_tokens=True)\r\nprint(\"[OUTPUT]\", output_str)\r\n```\r\n\r\nPython\r\n````\r\n[PROMPT] \u003c|begin_of_text|\u003e\u003c|start_header_id|\u003esystem\u003c|end_header_id|\u003e\r\n\r\nCutting Knowledge Date: December 2023\r\nToday Date: 26 Jul 2024\r\n\r\n\u003c|eot_id|\u003e\u003c|start_header_id|\u003euser\u003c|end_header_id|\u003e\r\n\r\nWrite a python function that prints 'hello world' in reverse.\u003c|eot_id|\u003e\u003c|start_header_id|\u003eassistant\u003c|end_header_id|\u003e\r\n\r\n \r\n\r\n[OUTPUT] **Reversing 'Hello World' Function**\r\n=====================================\r\n\r\nHere is a simple Python function that prints 'hello world' in reverse:\r\n\r\n```python\r\ndef print_reverse_hello_world():\r\n \"\"\"\r\n Prints 'hello world' in reverse.\r\n \"\"\"\r\n message = \"hello world\"\r\n reversed_message = message[::-1]\r\n print(reversed_message)\r\n\r\nprint_reverse_hello_world()\r\n```\r\n\r\n**\r\n````\r\n\r\nJava\r\n````\r\n[PROMPT] \u003c|begin_of_text|\u003e\u003c|start_header_id|\u003esystem\u003c|end_header_id|\u003e\r\n\r\nCutting Knowledge Date: December 2023\r\nToday Date: 26 Jul 2024\r\n\r\n\u003c|eot_id|\u003e\u003c|start_header_id|\u003euser\u003c|end_header_id|\u003e\r\n\r\nWrite a java function that prints 'hello world' in reverse.\u003c|eot_id|\u003e\u003c|start_header_id|\u003eassistant\u003c|end_header_id|\u003e\r\n\r\n \r\n\r\n[OUTPUT] interface\r\nHere is a simple Java function that prints 'hello world' in reverse:\r\n\r\n```java\r\npublic class HelloWorld {\r\n public static void main(String[] args) {\r\n System.out.println(\"Hello World\");\r\n }\r\n}\r\n```\r\n\r\nExplanation:\r\n\r\n- The `System.out.println()` function is used to print the string \"Hello World\" to the console.\r\n- The `public static void\r\n````\r\n\r\nGo\r\n````\r\n[PROMPT] \u003c|begin_of_text|\u003e\u003c|start_header_id|\u003esystem\u003c|end_header_id|\u003e\r\n\r\nCutting Knowledge Date: December 2023\r\nToday Date: 26 Jul 2024\r\n\r\n\u003c|eot_id|\u003e\u003c|start_header_id|\u003euser\u003c|end_header_id|\u003e\r\n\r\nWrite a go function that prints 'hello world' in reverse.\u003c|eot_id|\u003e\u003c|start_header_id|\u003eassistant\u003c|end_header_id|\u003e\r\n\r\n \r\n\r\n[OUTPUT] \\ \r\n\\\r\n\r\n\\\r\n\r\n\\\r\n...\r\n````","author":{"url":"https://github.com/ivnle","@type":"Person","name":"ivnle"},"datePublished":"2024-12-19T22:44:42.000Z","interactionStatistic":{"@type":"InteractionCounter","interactionType":"https://schema.org/CommentAction","userInteractionCount":1},"url":"https://github.com/137/syncode/issues/137"}
| route-pattern | /_view_fragments/issues/show/:user_id/:repository/:id/issue_layout(.:format) |
| route-controller | voltron_issues_fragments |
| route-action | issue_layout |
| fetch-nonce | v2:2e126b27-13c8-293e-30a4-0968edfc826c |
| current-catalog-service-hash | 81bb79d38c15960b92d99bca9288a9108c7a47b18f2423d0f6438c5b7bcd2114 |
| request-id | BCFE:1F9D56:671011:83E773:6990A9E9 |
| html-safe-nonce | d4f874ba7adcaf7df1bbd716820e25b74c22f6e36984b67cd5376041cd31a219 |
| visitor-payload | eyJyZWZlcnJlciI6IiIsInJlcXVlc3RfaWQiOiJCQ0ZFOjFGOUQ1Njo2NzEwMTE6ODNFNzczOjY5OTBBOUU5IiwidmlzaXRvcl9pZCI6IjU5MDQ0NTg2MzA2OTUwMDQ2NDkiLCJyZWdpb25fZWRnZSI6ImlhZCIsInJlZ2lvbl9yZW5kZXIiOiJpYWQifQ== |
| visitor-hmac | 327513799a25591a77f57e0a37c6d82b78fdcaafa1705263829c759f1b59723e |
| hovercard-subject-tag | issue:2751560064 |
| github-keyboard-shortcuts | repository,issues,copilot |
| google-site-verification | Apib7-x98H0j5cPqHWwSMm6dNU4GmODRoqxLiDzdx9I |
| octolytics-url | https://collector.github.com/github/collect |
| analytics-location | / |
| fb:app_id | 1401488693436528 |
| apple-itunes-app | app-id=1477376905, app-argument=https://github.com/_view_fragments/issues/show/structuredllm/syncode/137/issue_layout |
| twitter:image | https://opengraph.githubassets.com/3705fa018bf4735600f128d52628aeec0a3622d978c8fd6532c62625555c166e/structuredllm/syncode/issues/137 |
| twitter:card | summary_large_image |
| og:image | https://opengraph.githubassets.com/3705fa018bf4735600f128d52628aeec0a3622d978c8fd6532c62625555c166e/structuredllm/syncode/issues/137 |
| og:image:alt | I am experiencing issues with the built-in grammars for Java, Python, and Go. The Python and Java grammars appear to be ignored, while the Go grammar produces mostly gibberish. Below is a script to... |
| og:image:width | 1200 |
| og:image:height | 600 |
| og:site_name | GitHub |
| og:type | object |
| og:author:username | ivnle |
| hostname | github.com |
| expected-hostname | github.com |
| None | 42c603b9d642c4a9065a51770f75e5e27132fef0e858607f5c9cb7e422831a7b |
| turbo-cache-control | no-preview |
| go-import | github.com/structuredllm/syncode git https://github.com/structuredllm/syncode.git |
| octolytics-dimension-user_id | 204232273 |
| octolytics-dimension-user_login | structuredllm |
| octolytics-dimension-repository_id | 687211074 |
| octolytics-dimension-repository_nwo | structuredllm/syncode |
| octolytics-dimension-repository_public | true |
| octolytics-dimension-repository_is_fork | false |
| octolytics-dimension-repository_network_root_id | 687211074 |
| octolytics-dimension-repository_network_root_nwo | structuredllm/syncode |
| turbo-body-classes | logged-out env-production page-responsive |
| disable-turbo | false |
| browser-stats-url | https://api.github.com/_private/browser/stats |
| browser-errors-url | https://api.github.com/_private/browser/errors |
| release | 3b33c5aedc9808f45bc5fcf0b1e4404cf749dac7 |
| ui-target | full |
| theme-color | #1e2327 |
| color-scheme | light dark |
Links:
Viewport: width=device-width