Snowflake/Arctic-Text2SQL-R1-7B Fine-tuned for NL2SQL++ v8
This model is a fine-tuned version of Snowflake/Arctic-Text2SQL-R1-7B on the NL2SQL++ v8 dataset with code-with-thought reasoning.
Model Details
- Base Model: Snowflake/Arctic-Text2SQL-R1-7B
- Task: Text-to-SQL generation
- Dataset: NL2SQL++ v8 with code-with-thought reasoning
- Fine-tuning Method: LoRA (Low-Rank Adaptation) with Unsloth
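- Saved Weights: 16-bit merged weights (LoRA adapters merged into the base model)
- Training Dataset Size: 1,358 examples (dataset shape (1358, 1))
- Validation Dataset Size: 0 examples (no validation split; dataset shape (0, 0))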
Training Configuration
- output_dir: trainer_output
- overwrite_output_dir: False
- do_train: False
- do_eval: False
- do_predict: False
- eval_strategy: IntervalStrategy.NO
- prediction_loss_only: False
- per_device_train_batch_size: 8
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 8
- eval_accumulation_steps: 10
- eval_delay: 0
- torch_empty_cache_steps: None
- learning_rate: 0.0002
- weight_decay: 0.01
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3.0
- max_steps: -1
- lr_scheduler_type: SchedulerType.COSINE
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_dir: trainer_output/runs/Feb03_10-21-41_ip-172-31-46-165.ap-southeast-2.compute.internal
- logging_strategy: IntervalStrategy.STEPS
- logging_first_step: False
- logging_steps: 0.004
- logging_nan_inf_filter: True
- save_strategy: SaveStrategy.NO
- save_steps: 0.04
- save_total_limit: 2
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 3407
- data_seed: None
- jit_mode_eval: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- eval_steps: 0.04
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- run_name: None
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- metric_for_best_model: loss
- greater_is_better: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: AcceleratorConfig(split_batches=False, dispatch_batches=None, even_batches=True, use_seedable_sampler=True, non_blocking=False, gradient_accumulation_kwargs=None, use_configured_state=False)
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: OptimizerNames.PAGED_ADAMW
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- report_to: ['wandb']
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: None
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: HubStrategy.EVERY_SAVE
- hub_token: None
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: True
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- push_to_hub_token: None
- _n_gpu: 1
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- model_init_kwargs: None
- chat_template_path: None
- dataset_text_field: text
- dataset_kwargs: None
- dataset_num_proc: None
- eos_token: None
- pad_token: None
- max_length: 1024
- packing: False
- packing_strategy: bfd
- padding_free: False
- pad_to_multiple_of: None
- eval_packing: None
- completion_only_loss: None
- assistant_only_loss: False
- loss_type: nll
- activation_offloading: False
- vllm_sampling_params: None
- unsloth_num_chunks: -1
- unsloth_logit_chunk_multiplier: None
- unsloth_grpo_mini_batch: None
- max_seq_length: 15000
- model_name: Snowflake/Arctic-Text2SQL-R1-7B
- train_batch_size: 8
- val_batch_size: 1
- num_epochs: 2.5
- lora_r: 64
- lora_alpha: 128
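As a rough reading aid for the values above, the following sketch derives the effective batch size, an approximate optimizer-step budget, and the LoRA scaling factor. It is an estimate under stated assumptions, not a record of the actual run: it takes num_train_epochs: 3.0 (the list also reports num_epochs: 2.5), assumes partial batches and partial accumulation windows still count, assumes fractional logging_steps is a ratio of total steps, and assumes the standard alpha / r LoRA scaling (no rank-stabilized variant). The exact step count reported by the trainer can differ by library version.

```python
import math

# Values copied from the configuration list above.
num_examples = 1358                  # Training Dataset Size
per_device_train_batch_size = 8
gradient_accumulation_steps = 8
n_gpu = 1                            # _n_gpu
num_train_epochs = 3.0               # assumption: the list also shows num_epochs: 2.5
warmup_ratio = 0.1
logging_steps = 0.004                # a value below 1 is treated here as a ratio of total steps
lora_r = 64
lora_alpha = 128

# Examples consumed per optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * n_gpu
)

# Assumption: the last partial micro-batch and the last partial accumulation
# window are both kept (dataloader_drop_last is False).
micro_batches_per_epoch = math.ceil(
    num_examples / (per_device_train_batch_size * n_gpu)
)
updates_per_epoch = math.ceil(micro_batches_per_epoch / gradient_accumulation_steps)
total_updates = math.ceil(num_train_epochs * updates_per_epoch)

# Warmup length and logging interval implied by the ratio settings.
warmup_updates = math.ceil(warmup_ratio * total_updates)
log_every = max(1, math.ceil(logging_steps * total_updates))

# Standard LoRA scaling applied to the low-rank update: alpha / r.
lora_scaling = lora_alpha / lora_r

print(f"effective batch size : {effective_batch_size}")
print(f"updates per epoch    : {updates_per_epoch}")
print(f"total updates        : {total_updates}")
print(f"warmup updates       : {warmup_updates}")
print(f"log every N updates  : {log_every}")
print(f"LoRA scaling         : {lora_scaling}")
```

Under these assumptions the run works out to 64 examples per optimizer update, 22 updates per epoch, 66 updates in total with roughly 7 of them in warmup, logging on every update, and a LoRA scaling factor of 2.0.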
Train Dataset Example
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
You are an expert in SQL++ query generation. You will be given a document schema and a natural language query. You need to generate a valid SQL++ query equivalent to the natural language query.
Bucket Name: `training_bucket`
Scope Name: `Educational_Learning_Management_Ecosystem_scope`
Use the given document schema to generate the SQL++ query.
Document Schema:
{"`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`notifications`": {"Flavor": "", "properties": {"channel": {"samples": ["email"], "type": "string"}, "created_at": {"samples": ["2025-11-07T01:32:43.289137+00:..."], "type": "string"}, "data": {"properties": {"assignment_id": {"samples": ["assignment_113d1c3f-0344-4ce8-..."], "type": "string"}, "course_id": {"samples": ["course_49eebebc-18d9-4399-b816..."], "type": "string"}, "discussion_id": {"samples": ["discussion_0ddfbd79-5abd-4e0a-..."], "type": "string"}, "due_at": {"samples": [["2025-01-31T07:45:11.761532+00:..."]], "type": "string"}, "event_id": {"samples": ["event_4763a9f6-c0b3-44f4-827f-..."], "type": "string"}, "item_id": {"samples": ["assignment_0b55447b-cc47-468b-..."], "type": "string"}, "item_type": {"samples": ["assignment"], "type": "string"}, "post_id": {"samples": ["post_2e477677-7dd8-4275-9651-7..."], "type": "string"}, "score": {"samples": [[5.81]], "type": "number"}, "start_at": {"samples": ["2025-10-29T19:19:27.216807+00:..."], "type": "string"}, "title": {"samples": ["End source environmental"], "type": "string"}}, "type": "object"}, "dedupe_key": {"samples": [["4609b204-89ea-4641-a4df-318e7e..."]], "type": "string"}, "expires_at": {"samples": [["2025-10-27T13:09:34.291981+00:..."]], "type": "string"}, "message": {"samples": ["Answer shoulder guess station ..."], "type": "string"}, "priority": {"samples": ["high"], "type": "string"}, "read_at": {"samples": [["2026-01-11T19:01:08.316551+00:..."]], "type": "string"}, "sent_at": {"samples": [["2025-10-16T14:26:34.291981+00:..."]], "type": "string"}, "status": {"samples": ["failed"], "type": "string"}, "tenant_id": {"samples": ["tenant_002"], "type": "string"}, "title": {"samples": ["Chance box safe anything groun..."], "type": "string"}, "type": {"samples": ["announcement"], "type": "string"}, "updated_at": {"samples": ["2026-01-21T15:29:23.881+05:30"], "type": "string"}, "user_id": {"samples": 
["user_090925b5-6741-47de-980e-3..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`events`": {"Flavor": "", "properties": {"audience": {"properties": {"target": {"samples": ["all"], "type": "string"}, "user_ids": {"items": {}, "samples": [["user_70ab58d9-8841-4775-9264-2bc5f23b4c91"]], "type": "array"}}, "type": "object"}, "course_id": {"samples": [["course_4aea616c-67c0-402a-bc87..."]], "type": "string"}, "created_at": {"samples": ["2025-01-27T22:17:04.155691+00:..."], "type": "string"}, "created_by_user_id": {"samples": ["user_3cc06b12-1b48-448a-b543-f..."], "type": "string"}, "description": {"samples": [["Building off section thought f..."]], "type": "string"}, "end_at": {"samples": [["2025-10-25T04:49:27.085555+00:..."]], "type": "string"}, "location": {"samples": [["505 Miller Field Apt. 387"]], "type": "string"}, "online_meeting": {"properties": {"join_url": {"samples": ["http://mccarthy.com/"], "type": "string"}, "meeting_id": {"samples": [["119831713"]], "type": "string"}, "provider": {"samples": [["meet"]], "type": "string"}}, "type": "object"}, "recurrence": {"properties": {"rrule": {"samples": ["FREQ=WEEKLY;BYDAY=MO,WE"], "type": "string"}, "until": {"samples": ["2025-12-20T14:19:27.144623+00:..."], "type": "string"}}, "type": "object"}, "section_id": {"samples": [["section_4c46e636-6448-4a2d-8f9..."]], "type": "string"}, "start_at": {"samples": ["2025-10-25T14:19:27.144623+00:..."], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "timezone": {"samples": [["America/Chicago"]], "type": "string"}, "title": {"samples": ["Newspaper throughout black"], "type": "string"}, "type": {"samples": ["assignment_due"], "type": "string"}, "updated_at": {"samples": ["2025-02-23T00:32:47.155691+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`discussion_posts`": {"Flavor": "", "properties": {"attachments": 
{"items": {"properties": {"display_name": {"samples": ["but.txt"], "type": "string"}, "file_id": {"samples": ["file_0c87cb96-e3c4-4575-90d1-1..."], "type": "string"}}, "type": "object"}, "samples": [[{"display_name": "court.js", "file_id": "file_a67720ba-5d58-4fd7-8f1b-bce212add500"}]], "type": "array"}, "body": {"samples": ["Course can into raise against ..."], "type": "string"}, "course_id": {"samples": ["course_3f9a9b28-65ee-4537-b43f..."], "type": "string"}, "created_at": {"samples": ["2025-07-22T02:43:13.029530+00:..."], "type": "string"}, "deleted_at": {"samples": [["2025-10-03T20:05:35.036525+00:..."]], "type": "string"}, "discussion_id": {"samples": ["discussion_1df7e7c6-0697-4b2f-..."], "type": "string"}, "edited_at": {"samples": [["2025-07-26T20:27:51.047769+00:..."]], "type": "string"}, "parent_post_id": {"samples": [["post_29739d5f-1f05-4d62-8562-f..."]], "type": "string"}, "reactions": {"properties": {"counts": {"properties": {"likes": {"samples": [0], "type": "number"}}, "type": "object"}, "like_user_ids": {"items": {"type": "string"}, "samples": [["user_42f332b6-eac9-4fd0-aa96-a90e441a9f85"]], "type": "array"}}, "type": "object"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "updated_at": {"samples": ["2025-08-17T08:11:03.029530+00:..."], "type": "string"}, "user_id": {"samples": ["user_1034db05-065b-4890-a2dd-e..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`discussions`": {"Flavor": "", "properties": {"availability": {"properties": {"close_at": {"samples": [["2025-04-19T21:24:57.977204+00:..."]], "type": "string"}, "open_at": {"samples": [["2025-03-26T21:24:57.977204+00:..."]], "type": "string"}}, "type": "object"}, "course_id": {"samples": ["course_3b923cb7-8af6-41cc-aa00..."], "type": "string"}, "created_at": {"samples": ["2025-03-14T21:24:57.977204+00:..."], "type": "string"}, "created_by_user_id": {"samples": ["user_1ad530e4-bddf-4d7f-80ec-c..."], "type": 
"string"}, "module_id": {"samples": [["module_916c1fa7-86e5-47e1-9c0e..."]], "type": "string"}, "prompt": {"samples": [["Available wish center religiou..."]], "type": "string"}, "published": {"samples": [false], "type": "boolean"}, "settings": {"properties": {"allow_anonymous": {"samples": [false], "type": "boolean"}, "locked": {"samples": [false], "type": "boolean"}, "pinned": {"samples": [false], "type": "boolean"}, "require_post_before_view": {"samples": [false], "type": "boolean"}, "threaded": {"samples": [false], "type": "boolean"}}, "type": "object"}, "tenant_id": {"samples": ["tenant_003"], "type": "string"}, "title": {"samples": ["Conference institution apply w..."], "type": "string"}, "type": {"samples": ["announcement"], "type": "string"}, "updated_at": {"samples": ["2025-04-01T21:18:17.977204+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`gradebook_entries`": {"Flavor": "", "properties": {"calculation": {"properties": {"override": {"properties": {"reason": {"samples": [["accommodation"]], "type": "string"}, "score": {"samples": [[0.66]], "type": "number"}, "set_at": {"samples": [["2025-11-19T05:43:41.930507+00:..."]], "type": "string"}, "set_by_user_id": {"samples": [["user_2eda6c55-c904-42b2-b3b6-8..."]], "type": "string"}}, "type": "object"}, "source_doc": {"properties": {"collection": {"samples": ["quiz_attempts"], "type": "string"}, "doc_id": {"samples": ["quiz_attempt_626f89b9-13e0-4cd..."], "type": "string"}}, "type": "object"}}, "type": "object"}, "course_id": {"samples": ["course_17f65603-c8fa-4d67-b896..."], "type": "string"}, "created_at": {"samples": ["2025-02-28T14:09:01.930116+00:..."], "type": "string"}, "excluded": {"samples": [false], "type": "boolean"}, "grade": {"samples": [["40.5%"]], "type": "string"}, "graded_at": {"samples": [["2025-09-23T17:29:31.775201+00:..."]], "type": "string"}, "grader_user_id": {"samples": [["user_452f12b5-d349-470e-8ade-7..."]], "type": 
"string"}, "item_id": {"samples": ["assignment_1e2ad42e-3633-4498-..."], "type": "string"}, "item_type": {"samples": ["assignment"], "type": "string"}, "points_possible": {"samples": [16], "type": "number"}, "score": {"samples": [[0.66]], "type": "number"}, "status": {"samples": ["excused"], "type": "string"}, "submitted_at": {"samples": [["2025-08-30T09:32:28.725463+00:..."]], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "updated_at": {"samples": ["2025-03-22T02:56:01.930116+00:..."], "type": "string"}, "user_id": {"samples": ["user_38a241e3-c8b0-4fe9-9431-8..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`quiz_attempts`": {"Flavor": "", "properties": {"attempt": {"samples": [1], "type": "number"}, "course_id": {"samples": ["course_62a966c3-fb86-4b24-b8cf..."], "type": "string"}, "created_at": {"samples": ["2025-11-03T16:32:03.836473+00:..."], "type": "string"}, "proctoring": {"properties": {"flags": {"items": {"type": "string"}, "samples": [["audio_detected"]], "type": "array"}, "provider": {"samples": [["internal"]], "type": "string"}, "session_id": {"samples": [["6ab06124-bdf1-4770-b061-c98f8d..."]], "type": "string"}}, "type": "object"}, "quiz_id": {"samples": ["quiz_16aadf85-c5c7-4b17-bfe8-d..."], "type": "string"}, "responses": {"items": {"Flavor": "", "properties": {"answer": {"properties": {"text": {"samples": ["Fire actually human control th..."], "type": "string"}, "value": {"properties": {"choice_id": {"samples": ["choice_61e1ff44-79b7-407c-9f51..."], "type": "string"}, "value": {"samples": [false], "type": "boolean"}}, "type": "object"}}, "type": "object"}, "is_correct": {"samples": [[false]], "type": "boolean"}, "question_id": {"samples": ["q_48a670ac-a0ac-4b9a-96fb-2b80..."], "type": "string"}, "score": {"samples": [0.09], "type": "number"}}, "type": "object"}, "samples": [[{"answer": {"text": "Anything like pick letter bad guess edge goal 
responsibility."}, "is_correct": null, "question_id": "q_11a12c47-c8c1-4c6b-8108-6944b5cf2d72", "score": 0.87}]], "type": "array"}, "result": {"properties": {"graded_at": {"samples": [["2025-09-18T01:57:12.808321+00:..."]], "type": "string"}, "points_possible": {"samples": [[10]], "type": "number"}, "score": {"samples": [[0]], "type": "number"}}, "type": "object"}, "started_at": {"samples": ["2025-11-03T16:32:03.836473+00:..."], "type": "string"}, "status": {"samples": ["abandoned"], "type": "string"}, "submitted_at": {"samples": [["2025-11-03T16:37:36.836473+00:..."]], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "time_taken_seconds": {"samples": [[281]], "type": "number"}, "updated_at": {"samples": ["2025-09-20T22:05:56.808321+00:..."], "type": "string"}, "user_id": {"samples": ["user_27103376-daba-4cf2-a5ff-9..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`quizzes`": {"Flavor": "", "properties": {"attempts_allowed": {"samples": [[1]], "type": "number"}, "availability": {"properties": {"close_at": {"samples": [["2025-01-22T08:56:35.108079+00:..."]], "type": "string"}, "open_at": {"samples": [["2025-01-11T08:56:35.108079+00:..."]], "type": "string"}}, "type": "object"}, "course_id": {"samples": ["course_1f8900fd-6a01-4de8-8b39..."], "type": "string"}, "created_at": {"samples": ["2025-03-28T15:12:56.102541+00:..."], "type": "string"}, "created_by_user_id": {"samples": ["user_0f4f2da9-7be0-4450-8d29-7..."], "type": "string"}, "description": {"samples": [["Add activity despite family. 
O..."]], "type": "string"}, "module_id": {"samples": [["module_40ebf914-6fa9-467f-b640..."]], "type": "string"}, "published": {"samples": [false], "type": "boolean"}, "questions": {"items": {"Flavor": "", "properties": {"choices": {"items": {"Flavor": "", "properties": {"choice_id": {"type": "string"}, "is_correct": {"type": "null"}, "text": {"type": "string"}}, "type": "object"}, "samples": [[[{"choice_id": "choice_729f63fc-a5f0-4111-8527-66275da726af", "is_correct": null, "text": "True"}]]], "type": "array"}, "correct_answer": {"properties": {"choice_id": {"samples": ["choice_104ec635-0392-4955-ab84..."], "type": "string"}, "value": {"samples": [false], "type": "boolean"}}, "type": "object"}, "feedback": {"properties": {"correct": {"samples": [], "type": "null"}, "incorrect": {"samples": ["Agreement camera anything acco..."], "type": "string"}}, "samples": [[{"correct": null, "incorrect": "Agreement camera anything according catch."}]], "type": ["object", "object", "object", "object"]}, "points": {"samples": [1], "type": "number"}, "prompt": {"samples": ["Affect tree generation low not..."], "type": "string"}, "question_id": {"samples": ["q_2050decb-ee26-43b9-961b-70e6..."], "type": "string"}, "type": {"samples": ["multiple_choice"], "type": "string"}}, "type": "object"}, "samples": [[{"choices": [{"choice_id": "choice_271a3ef4-4870-4c6e-a9ef-8ae43fadbc17", "is_correct": false, "text": "Admit"}], "correct_answer": {"choice_id": "choice_5fb5204c-46be-4190-b4d4-c828db3e2c81"}, "feedback": null, "points": 1, "prompt": "Blue few enter ago free baby behavior include heart side often?", "question_id": "q_7cdfa184-c03b-4bff-acf6-33f24b0cf844", "type": "multiple_choice"}]], "type": "array"}, "quiz_type": {"samples": ["graded"], "type": "string"}, "shuffle_answers": {"samples": [false], "type": "boolean"}, "tenant_id": {"samples": ["tenant_002"], "type": "string"}, "time_limit_minutes": {"samples": [[10]], "type": "number"}, "title": {"samples": ["Best nation tend 
provide"], "type": "string"}, "updated_at": {"samples": ["2025-04-24T20:48:41.102541+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`submissions`": {"Flavor": "`status` = \"needs_review\", `updated_at` = \"2026-01-21T15:30:07.037+05:30\"", "properties": {"assignment_id": {"samples": ["assignment_15f089e8-2ed4-469a-..."], "type": "string"}, "attempt": {"samples": [1], "type": "number"}, "content": {"properties": {"attachments": {"items": {}, "samples": [[{"content_type": "application/pdf", "display_name": "arrive.css", "file_id": "file_81dee26b-5d23-4073-a92e-e563ed86810c", "size_bytes": 24867691}]], "type": "array"}, "text": {"samples": [["Fine bank car scientist messag..."]], "type": "string"}, "url": {"samples": [["http://armstrong.com/"]], "type": "string"}}, "type": "object"}, "course_id": {"samples": ["course_2ec0d73d-b399-4d00-a698..."], "type": "string"}, "created_at": {"samples": ["2025-07-17T08:36:54.785128+00:..."], "type": "string"}, "grade": {"properties": {"feedback": {"samples": [["Dream make especially sure sto..."]], "type": "string"}, "grade": {"samples": [["18.27"]], "type": "string"}, "graded_at": {"samples": [["2025-09-25T17:21:22.746877+00:..."]], "type": "string"}, "grader_user_id": {"samples": [["user_050e1825-5579-40ba-9244-8..."]], "type": "string"}, "rubric_assessment": {"items": {"Flavor": "", "properties": {"comment": {"samples": [], "type": "string"}, "criterion_id": {"type": "string"}, "points": {"type": "number"}, "rating_id": {"type": "string"}}, "type": "object"}, "samples": [[[{"comment": null, "criterion_id": "crit_260f3dcd-a502-43f0-a3e4-89f63e721bb6", "points": 10, "rating_id": "rate_eabc48d3-8cff-408c-b386-824d902e56e1"}]]], "type": "array"}, "score": {"samples": [[12.95]], "type": "number"}}, "type": "object"}, "integrity": {"properties": {"report_url": {"samples": [["http://howard.info/"]], "type": "string"}, "similarity_score": {"samples": [[0.09]], 
"type": "number"}}, "type": "object"}, "late": {"samples": [false], "type": "boolean"}, "status": {"samples": ["needs_review"], "type": "string"}, "submitted_at": {"samples": [["2025-07-17T16:36:54.785128+00:..."]], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "updated_at": {"samples": ["2026-01-21T15:30:07.037+05:30"], "type": "string"}, "user_id": {"samples": ["user_01507e1d-5a3b-4262-896e-c..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`assignments`": {"Flavor": "", "properties": {"course_id": {"samples": ["course_17367615-c9ae-4370-84a0..."], "type": "string"}, "created_at": {"samples": ["2025-01-27T02:27:55.847603+00:..."], "type": "string"}, "created_by_user_id": {"samples": ["user_1c4f5413-effd-4cd3-a4a1-a..."], "type": "string"}, "description": {"samples": [["Art concern prepare note."]], "type": "string"}, "due": {"properties": {"available_at": {"samples": [["2025-04-12T05:53:29.748351+00:..."]], "type": "string"}, "due_at": {"samples": [["2025-02-13T19:27:55.847603+00:..."]], "type": "string"}, "until_at": {"samples": [["2025-04-14T16:23:05.729169+00:..."]], "type": "string"}}, "type": "object"}, "grading_type": {"samples": ["letter"], "type": "string"}, "group": {"properties": {"group_set_id": {"samples": [["groupset_1b2cc073-037f-4adf-b0..."]], "type": "string"}, "is_group_assignment": {"samples": [false], "type": "boolean"}}, "type": "object"}, "module_id": {"samples": [["module_20391da2-9b83-419c-b403..."]], "type": "string"}, "points_possible": {"samples": [10], "type": "number"}, "published": {"samples": [false], "type": "boolean"}, "rubric": {"properties": {"criteria": {"items": {"properties": {"criterion_id": {"type": "string"}, "description": {"type": "string"}, "points": {"type": "number"}, "ratings": {"items": {"properties": {"description": {"type": "string"}, "points": {"type": "number"}, "rating_id": {"type": "string"}}, "type": "object"}, 
"type": "array"}}, "type": "object"}, "samples": [[{"criterion_id": "crit_029fd83a-484f-434e-92ca-e6bb649a9aeb", "description": "Choice over difference follow", "points": 3.3, "ratings": [{"description": "Excellent", "points": 0, "rating_id": "rate_d0b2bd0a-624f-4e05-a1a8-94072555078d"}]}]], "type": "array"}}, "type": "object"}, "settings": {"properties": {"allowed_attempts": {"samples": [[1]], "type": "number"}, "plagiarism_check": {"properties": {"enabled": {"samples": [false], "type": "boolean"}, "provider": {"samples": [["copyleaks"]], "type": "string"}}, "type": "object"}}, "type": "object"}, "submission_types": {"items": {"type": "string"}, "samples": [["discussion"]], "type": "array"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "title": {"samples": ["Especially idea than garden us..."], "type": "string"}, "type": {"samples": ["assignment"], "type": "string"}, "updated_at": {"samples": ["2025-02-12T06:54:11.847603+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`content_items`": {"Flavor": "", "properties": {"attachments": {"items": {"Flavor": "", "properties": {"content_type": {"samples": [["application/pdf"]], "type": "string"}, "display_name": {"samples": ["art.js"], "type": "string"}, "file_id": {"samples": ["file_0406d594-37cc-4d3a-b43f-f..."], "type": "string"}, "size_bytes": {"samples": [[41224851]], "type": "number"}}, "type": "object"}, "samples": [[{"content_type": "application/pdf", "display_name": "teacher.txt", "file_id": "file_260fd213-efe5-4418-bf1a-f16dae9d30dd", "size_bytes": 12490685}]], "type": "array"}, "body": {"samples": [["Everything hot right least ref..."]], "type": "string"}, "course_id": {"samples": ["course_1a6dc071-4351-442b-98df..."], "type": "string"}, "created_at": {"samples": ["2025-01-16T06:25:22.580910+00:..."], "type": "string"}, "created_by_user_id": {"samples": ["user_3e391859-db89-4dfe-b3a3-e..."], "type": "string"}, "html": 
{"samples": [["<p>Determine mean yourself bil..."]], "type": "string"}, "links": {"items": {"type": "string"}, "samples": [["http://marshall-evans.com/"]], "type": "array"}, "module_id": {"samples": [["module_03c3b849-7320-4f92-a2f4..."]], "type": "string"}, "published": {"samples": [false], "type": "boolean"}, "tenant_id": {"samples": ["tenant_002"], "type": "string"}, "title": {"samples": ["Arrive by film also along"], "type": "string"}, "type": {"samples": ["embed"], "type": "string"}, "updated_at": {"samples": ["2025-01-22T19:52:48.580910+00:..."], "type": "string"}, "version": {"samples": [1], "type": "number"}, "visibility": {"samples": ["course"], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`modules`": {"Flavor": "", "properties": {"availability": {"properties": {"lock_at": {"samples": [["2024-11-16T12:56:39.425758+00:..."]], "type": "string"}, "unlock_at": {"samples": [["2024-10-01T06:29:08.481035+00:..."]], "type": "string"}}, "type": "object"}, "course_id": {"samples": ["course_36f0ca81-f668-4bff-8ffc..."], "type": "string"}, "created_at": {"samples": ["2024-09-29T06:29:08.481035+00:..."], "type": "string"}, "description": {"samples": [["Deal activity opportunity fact..."]], "type": "string"}, "items": {"items": {"properties": {"item_id": {"samples": ["moditem_589b54ad-23a5-4c81-9ec..."], "type": "string"}, "position": {"samples": [1], "type": "number"}, "resource_ref": {"properties": {"collection": {"samples": ["assignments"], "type": "string"}, "doc_id": {"samples": ["assignment_42016f18-2398-4f69-..."], "type": "string"}}, "type": "object"}, "title": {"samples": ["Make marriage arm chance work ..."], "type": "string"}, "type": {"samples": ["assignment"], "type": "string"}}, "type": "object"}, "samples": [[{"item_id": "moditem_15cda551-142d-46e7-94fb-f1b2f3ea4d55", "position": 1, "resource_ref": {"collection": "quizzes", "doc_id": "quiz_05261cf1-807b-41a2-a8f7-2e194ca4a144"}, "title": 
"Answer close enough chair again", "type": "quiz"}]], "type": "array"}, "position": {"samples": [1], "type": "number"}, "published": {"samples": [false], "type": "boolean"}, "requirements": {"items": {"Flavor": "", "properties": {"min_score": {"samples": [[5.7]], "type": "number"}, "target_id": {"samples": ["assignment_87764d82-0caa-47cc-..."], "type": "string"}, "target_type": {"samples": ["assignment"], "type": "string"}, "type": {"samples": ["min_score"], "type": "string"}}, "type": "object"}, "samples": [[{"min_score": null, "target_id": "assignment_5bb34c45-f30c-4bcf-ab85-93f247b8b8ac", "target_type": "assignment", "type": "must_submit"}]], "type": "array"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "title": {"samples": ["Module 12: Station Manage"], "type": "string"}, "updated_at": {"samples": ["2024-10-14T12:23:31.481035+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`enrollments`": {"Flavor": "", "properties": {"audit": {"properties": {"import_batch_id": {"samples": [["batch_0fa3951f-ad4a-444c-8dfa-..."]], "type": "string"}, "source": {"samples": [["api"]], "type": "string"}}, "type": "object"}, "completion": {"properties": {"completed_at": {"samples": [["2025-09-07T13:12:03.720003+00:..."]], "type": "string"}, "progress_percent": {"samples": [0.6000000000000001], "type": "number"}}, "type": "object"}, "course_id": {"samples": ["course_0285ea20-fdae-4d2c-9f87..."], "type": "string"}, "dropped_at": {"samples": [["2025-04-03T06:20:52.654260+00:..."]], "type": "string"}, "enrolled_at": {"samples": ["2025-07-15T20:51:56.659410+00:..."], "type": "string"}, "role": {"samples": ["instructor"], "type": "string"}, "section_id": {"samples": [["section_58e0de81-f4c5-4a45-87f..."]], "type": "string"}, "status": {"samples": ["active"], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "user_id": {"samples": ["user_11467fe6-8b1d-4fa9-8d93-5..."], "type": 
"string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`sections`": {"Flavor": "", "properties": {"capacity": {"samples": [[18]], "type": "number"}, "course_id": {"samples": ["course_3ba6dbda-e1de-41be-9a60..."], "type": "string"}, "created_at": {"samples": ["2024-02-18T04:06:28.338885+00:..."], "type": "string"}, "instructor_user_ids": {"items": {"type": "string"}, "samples": [["user_068b0627-2272-4d94-9d67-fc22df24280f"]], "type": "array"}, "name": {"samples": ["Section 1"], "type": "string"}, "schedule": {"properties": {"end_time": {"samples": [["10:50"]], "type": "string"}, "location": {"samples": [["159 Sarah Mission"]], "type": "string"}, "meeting_days": {"items": {"type": "string"}, "samples": [["Mon"]], "type": "array"}, "start_time": {"samples": [["09:00"]], "type": "string"}, "timezone": {"samples": [["America/Chicago"]], "type": "string"}}, "type": "object"}, "sis_section_id": {"samples": [["17930"]], "type": "string"}, "status": {"samples": ["active"], "type": "string"}, "ta_user_ids": {"items": {"type": "string"}, "samples": [["user_30ecbc9e-7850-4903-a223-8d99c3451e29"]], "type": "array"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "updated_at": {"samples": ["2024-03-01T21:15:24.338885+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`courses`": {"Flavor": "", "properties": {"catalog": {"properties": {"categories": {"items": {"type": "string"}, "samples": [["Business"]], "type": "array"}, "is_listed": {"samples": [false], "type": "boolean"}, "tags": {"items": {"type": "string"}, "samples": [["affect"]], "type": "array"}}, "type": "object"}, "code": {"samples": ["BUS-375"], "type": "string"}, "created_at": {"samples": ["2023-12-06T14:19:02.316417+00:..."], "type": "string"}, "credits": {"samples": [[0.7000000000000001]], "type": "number"}, "description": {"samples": [["Long senior tell method entire..."]], "type": 
"string"}, "grading": {"properties": {"late_policy": {"properties": {"deduction_percent_per_day": {"samples": [[1.6]], "type": "number"}, "enabled": {"samples": [false], "type": "boolean"}, "max_deduction_percent": {"samples": [[17]], "type": "number"}}, "type": "object"}, "pass_mark": {"samples": [[53.5]], "type": "number"}, "scheme": {"samples": ["letter"], "type": "string"}}, "type": "object"}, "language": {"samples": [["de"]], "type": "string"}, "org_id": {"samples": ["org_50b2c3af-475b-4d87-954d-c0..."], "type": "string"}, "owner_user_id": {"samples": ["user_33b8af1b-3d69-4eec-8d3f-4..."], "type": "string"}, "status": {"samples": ["active"], "type": "string"}, "tenant_id": {"samples": ["tenant_001"], "type": "string"}, "term": {"properties": {"end_date": {"samples": [["2025-10-09T00:00:00+00:00"]], "type": "string"}, "start_date": {"samples": [["2025-07-28T00:00:00+00:00"]], "type": "string"}, "term_id": {"samples": [["term_2f868ccb-dd44-46c9-9c2b-f..."]], "type": "string"}}, "type": "object"}, "title": {"samples": ["Above significant road word ad..."], "type": "string"}, "updated_at": {"samples": ["2023-12-17T21:31:09.316417+00:..."], "type": "string"}, "visibility": {"samples": ["org"], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`organizations`": {"Flavor": "", "properties": {"branding": {"properties": {"logo_url": {"samples": [["https://dummyimage.com/192x931"]], "type": "string"}, "primary_color": {"samples": [["#578F57"]], "type": "string"}, "support_email": {"samples": [["support@adkins-meyer.info"]], "type": "string"}}, "type": "object"}, "created_at": {"samples": ["2023-12-25T16:15:40.733218+00:..."], "type": "string"}, "name": {"samples": ["Davis, Sanchez and Rivas Schoo..."], "type": "string"}, "status": {"samples": ["active"], "type": "string"}, "tenant_id": {"samples": ["tenant_004"], "type": "string"}, "terms": {"properties": {"academic_year": {"samples": [["2022-2023"]], "type": 
"string"}, "current_term_id": {"samples": [["term_5e24eec5-c291-4889-b9bf-e..."]], "type": "string"}}, "type": "object"}, "type": {"samples": ["corporate"], "type": "string"}, "updated_at": {"samples": ["2024-01-05T22:23:57.733218+00:..."], "type": "string"}}, "type": "object"}, "`training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`users`": {"Flavor": "", "properties": {"created_at": {"samples": ["2024-04-30T18:54:35.941025+00:..."], "type": "string"}, "email": {"samples": ["amy.small2@example.edu"], "type": "string"}, "last_login_at": {"samples": [["2023-09-28T05:27:43.039995+00:..."]], "type": "string"}, "locale": {"samples": [["en-US"]], "type": "string"}, "name": {"properties": {"family": {"samples": ["Gomez"], "type": "string"}, "given": {"samples": ["Amy"], "type": "string"}, "middle": {"samples": [["Jeffrey"]], "type": "string"}, "preferred": {"samples": [["Cynthia"]], "type": "string"}}, "type": "object"}, "phone": {"samples": [["+1-792-873-3409"]], "type": "string"}, "preferences": {"properties": {"digest_frequency": {"samples": [["daily"]], "type": "string"}, "email_notifications": {"samples": [false], "type": "boolean"}, "push_notifications": {"samples": [false], "type": "boolean"}}, "type": "object"}, "profile": {"properties": {"avatar_url": {"samples": [["https://dummyimage.com/317x488"]], "type": "string"}, "bio": {"samples": [["Organization western cultural ..."]], "type": "string"}, "external_ids": {"properties": {"lti_subject": {"samples": [["abbce072-75b9-4349-ab6d-08f84f..."]], "type": "string"}, "oauth_sub": {"samples": [["2edf5624-44bc-446f-93d3-e662af..."]], "type": "string"}, "sis_user_id": {"samples": [["192979"]], "type": "string"}}, "type": "object"}, "pronouns": {"samples": [["he/him"]], "type": "string"}}, "type": "object"}, "role": {"samples": ["admin"], "type": "string"}, "status": {"samples": ["active"], "type": "string"}, "tenant_id": {"samples": ["tenant_002"], "type": "string"}, "timezone": {"samples": 
[["America/Chicago"]], "type": "string"}, "updated_at": {"samples": ["2024-05-22T16:41:27.941025+00:..."], "type": "string"}}, "type": "object"}}
Natural Language Query
I need the list of assignments in tenant_001 that allow only 1 attempt and have plagiarism checking enabled, along with how many submissions have a similarity score above 0.3; sort assignments by the highest average similarity score.
SQL++ Query:
<|im_end|>
<|im_start|>assistant
<think>
1. Understand the Question
- Core intent: retrieval + aggregation per assignment, with filtering and sorting by a computed aggregate.
- Entities: assignments, submissions.
- Filters:
  - tenant is tenant_001
  - assignments allow only 1 attempt
  - plagiarism checking enabled
- Aggregations needed per assignment:
  - count of submissions where similarity_score > 0.3
  - average similarity_score considering only those > 0.3
- Sort: by highest average similarity score (descending).
2. Identify Relevant Collections
- Primary collection: `training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`assignments` (we want a list of assignments).
- Secondary collection: `training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`submissions` (to compute similarity-based metrics per assignment).
- Need document id for assignments to match submissions.assignment_id → use META(a).id.
3. Map Fields to Collections
- From assignments (alias a):
  - tenant filter: a.tenant_id
  - attempt rule: a.settings.allowed_attempts
  - plagiarism flag: a.settings.plagiarism_check.enabled
  - output fields: a.course_id, a.title
  - assignment identifier: META(a).id
- From submissions (alias s):
  - join/tenant alignment: s.tenant_id
  - join key: s.assignment_id
  - similarity score: s.integrity.similarity_score
4. Analyze Arrays and Nesting
- No arrays need to be iterated for this request.
- Only nested objects:
  - assignments.settings.allowed_attempts
  - assignments.settings.plagiarism_check.enabled
  - submissions.integrity.similarity_score
5. Decide Join / Nest Strategy
- Need per-assignment aggregates over related submissions.
- Use LEFT JOIN so assignments with zero submissions still appear.
- Join condition should include tenant boundary and assignment id match:
  - s.tenant_id = a.tenant_id
  - s.assignment_id = META(a).id
6. Determine Filters and Aggregations
- WHERE filters apply to assignments:
  - a.tenant_id = "tenant_001"
  - a.settings.allowed_attempts = 1
  - a.settings.plagiarism_check.enabled = true
- Aggregations:
  - Count submissions with similarity_score > 0.3:
    - SUM(CASE WHEN s.integrity.similarity_score > 0.3 THEN 1 ELSE 0 END)
  - Average similarity_score for only those above 0.3:
    - AVG(CASE WHEN s.integrity.similarity_score > 0.3 THEN s.integrity.similarity_score ELSE NULL END)
    - returning NULL for non-qualifying rows ensures AVG only considers qualifying submissions.
- GROUP BY must include all non-aggregated SELECT items:
  - META(a).id, a.course_id, a.title
7. Decide Output Shape
- Output: assignment_id, course_id, title, count of high-similarity submissions, average high-similarity score.
- Sorting: ORDER BY avg_similarity_gt_0_3 DESC.
8. Validate SQL++ Constraints
- Use fully qualified collection names.
- Use META(a).id because alias a is defined.
- Ensure join predicates reference correct aliases.
- Ensure GROUP BY includes META(a).id, a.course_id, a.title.
- Use double quotes for string literal "tenant_001".
</think>
```sql++
SELECT META(a).id AS assignment_id, a.course_id, a.title, SUM(CASE WHEN s.integrity.similarity_score > 0.3 THEN 1 ELSE 0 END) AS submissions_similarity_gt_0_3, AVG(CASE WHEN s.integrity.similarity_score > 0.3 THEN s.integrity.similarity_score ELSE NULL END) AS avg_similarity_gt_0_3 FROM `training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`assignments` AS a LEFT JOIN `training_bucket`.`Educational_Learning_Management_Ecosystem_scope`.`submissions` AS s ON s.tenant_id = a.tenant_id AND s.assignment_id = META(a).id WHERE a.tenant_id = "tenant_001" AND a.settings.allowed_attempts = 1 AND a.settings.plagiarism_check.enabled = true GROUP BY META(a).id, a.course_id, a.title ORDER BY avg_similarity_gt_0_3 DESC
```
<|im_end|>
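The join and aggregation reasoning in steps 5 and 6 of the sample above can be checked against a small in-memory model. The following is a minimal Python sketch, not part of the training data: the toy records, the flattened `allowed_attempts` / `plagiarism_enabled` / `similarity_score` fields, and the `summarize` helper are all hypothetical and only mirror the shape of the query. It illustrates three behaviours the trace relies on: the LEFT JOIN keeps assignments with no submissions, the conditional SUM counts only scores above the threshold, and the conditional AVG ignores non-qualifying rows (yielding NULL when none qualify). Placing NULL averages last under descending order is an assumption about the database's collation, labeled as such in the code.

```python
from collections import defaultdict

# Hypothetical toy data; values are invented for illustration only.
assignments = [
    {"id": "a1", "tenant_id": "tenant_001", "allowed_attempts": 1, "plagiarism_enabled": True},
    {"id": "a2", "tenant_id": "tenant_001", "allowed_attempts": 1, "plagiarism_enabled": True},
    {"id": "a3", "tenant_id": "tenant_001", "allowed_attempts": 1, "plagiarism_enabled": True},
    {"id": "a4", "tenant_id": "tenant_001", "allowed_attempts": 3, "plagiarism_enabled": True},
]
submissions = [
    {"tenant_id": "tenant_001", "assignment_id": "a1", "similarity_score": 0.5},
    {"tenant_id": "tenant_001", "assignment_id": "a1", "similarity_score": 0.7},
    {"tenant_id": "tenant_001", "assignment_id": "a1", "similarity_score": 0.1},
    {"tenant_id": "tenant_001", "assignment_id": "a2", "similarity_score": 0.2},
    # Different tenant: excluded by the s.tenant_id = a.tenant_id join predicate.
    {"tenant_id": "tenant_002", "assignment_id": "a2", "similarity_score": 0.9},
]


def summarize(assignments, submissions, tenant="tenant_001", threshold=0.3):
    """Emulate the LEFT JOIN + conditional SUM/AVG of the sample query."""
    # Index submission scores by the join key (tenant_id, assignment_id).
    scores_by_key = defaultdict(list)
    for s in submissions:
        scores_by_key[(s["tenant_id"], s["assignment_id"])].append(s["similarity_score"])

    rows = []
    for a in assignments:
        # WHERE clause: the filters apply to assignments only.
        if not (a["tenant_id"] == tenant
                and a["allowed_attempts"] == 1
                and a["plagiarism_enabled"] is True):
            continue
        # LEFT JOIN: an assignment without submissions still produces a row.
        scores = scores_by_key.get((a["tenant_id"], a["id"]), [])
        high = [x for x in scores if x is not None and x > threshold]
        rows.append({
            "assignment_id": a["id"],
            "submissions_similarity_gt_0_3": len(high),  # SUM(CASE ... 1 ELSE 0)
            # AVG over qualifying scores only; None stands in for NULL.
            "avg_similarity_gt_0_3": sum(high) / len(high) if high else None,
        })

    # ORDER BY avg DESC; putting None (NULL) last is an assumption here.
    rows.sort(key=lambda r: (r["avg_similarity_gt_0_3"] is None,
                             -(r["avg_similarity_gt_0_3"] or 0.0)))
    return rows


rows = summarize(assignments, submissions)
for row in rows:
    print(row)
```

In this toy run, `a1` is the only assignment with qualifying submissions, while `a2` and `a3` still appear with a count of 0 and no average, which is the effect the trace attributes to choosing LEFT JOIN over an inner join.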
Model tree for jastorj/snowflake_arctic_text2sql_r1_7b-nl2sqlpp-16bit-v1-cw-15K
- Base model: Qwen/Qwen2.5-7B
- Finetuned: Qwen/Qwen2.5-Coder-7B
- Finetuned: Qwen/Qwen2.5-Coder-7B-Instruct
- Finetuned: Snowflake/Arctic-Text2SQL-R1-7B