Load pipe delimited using bq load and python subprocess on Windows

To load a pipe-delimited file with the bq load command from Python's subprocess module on Windows, you can use the following code:

python
import subprocess

# Set the path to the bq command-line tool
bq_path = "C:\\path\\to\\bq.cmd"

# Set the GCP project ID and dataset ID
project_id = "your-project-id"
dataset_id = "your-dataset-id"

# Set the table name and file path
table_name = "your-table-name"
file_path = "C:\\path\\to\\your\\file.txt"

# Set the delimiter for the file
delimiter = "|"

# Construct the bq load command as a single string. With shell=True the
# command runs through cmd.exe, which treats an unquoted | as a pipe
# operator, so the delimiter is wrapped in double quotes.
command = (
    f'"{bq_path}"'
    f" --project_id={project_id}"  # global flag, placed before the command
    " load"
    " --skip_leading_rows=1"
    " --source_format=CSV"
    f' --field_delimiter="{delimiter}"'
    f" {dataset_id}.{table_name}"
    f' "{file_path}"'
)

# Execute the bq load command; check=True raises CalledProcessError on failure
subprocess.run(command, shell=True, check=True)

Make sure to replace the placeholders with your actual values:

bq_path: Set the path to the bq command-line tool. This should point to the bq.cmd script in the Cloud SDK installation directory.
project_id: Replace with your GCP project ID.
dataset_id: Replace with the ID of the BigQuery dataset where you want to load the data.
table_name: Replace with the name of the BigQuery table where you want to load the data.
file_path: Replace with the full path to the pipe-delimited file you want to load.
delimiter: Set the delimiter used in your file. In this case, it is set to "|".
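
Before running the load, it can help to sanity-check the file path and delimiter. The sketch below is one possible check and not part of bq itself: check_load_inputs is a hypothetical helper name, and the single-byte rule assumes BigQuery's documented requirement that a CSV field delimiter be a single ISO-8859-1 character (with "tab" and "\t" accepted as spellings for a tab).

```python
from pathlib import Path


def check_load_inputs(file_path: str, delimiter: str) -> None:
    """Raise ValueError if the inputs are unlikely to work with bq load.

    Hypothetical helper for illustration; not part of the bq tool.
    """
    if not Path(file_path).is_file():
        raise ValueError(f"Source file not found: {file_path}")

    # Assumption: bq accepts the literal spellings "tab" and "\t" for a tab.
    if delimiter in ("tab", "\\t"):
        return

    # Otherwise the delimiter must encode to exactly one ISO-8859-1 byte.
    try:
        encoded = delimiter.encode("latin-1")
    except UnicodeEncodeError:
        raise ValueError(
            f"Delimiter {delimiter!r} is not an ISO-8859-1 character"
        ) from None
    if len(encoded) != 1:
        raise ValueError(
            f"Delimiter must be a single character, got {delimiter!r}"
        )
```

Calling check_load_inputs(file_path, delimiter) just before the load surfaces a bad path or delimiter as a Python error instead of a bq failure.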

The code constructs a command that invokes bq load with the specified parameters and executes it using subprocess.run().
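
When the arguments are passed as a list on Windows, subprocess joins them into one command line before the shell sees it. The short illustration below shows the result for a pipe delimiter, using the placeholder values from this answer. subprocess.list2cmdline is an internal helper of the standard library, used here only for inspection.

```python
import subprocess

args = [
    "bq",
    "load",
    "--field_delimiter=|",
    "your-dataset-id.your-table-name",
    "C:\\path\\to\\your\\file.txt",
]

# list2cmdline only adds quotes around arguments that contain whitespace
# (or are empty), so the pipe character is left unquoted.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

Because cmd.exe treats an unquoted | as a pipe operator, wrapping the delimiter in double quotes inside a command string (for example --field_delimiter="|") is one way to keep it literal.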

Note: The bq command only works if the Google Cloud SDK is installed and authenticated (for example, with gcloud auth login) on your Windows system.
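
A quick way to confirm the SDK is reachable from Python is to look for bq on PATH, and to surface the error text if a command fails. This is a minimal sketch that assumes bq is on PATH. find_bq and run_and_report are hypothetical helper names.

```python
import shutil
import subprocess
from typing import Optional


def find_bq() -> Optional[str]:
    """Return the full path to bq if it is on PATH, else None."""
    # On Windows, shutil.which honours PATHEXT, so this can resolve bq.cmd.
    return shutil.which("bq")


def run_and_report(command: str) -> int:
    """Run a shell command string, print its output, return the exit code."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True
    )
    if result.stdout:
        print(result.stdout)
    if result.returncode != 0:
        print(f"Command failed with exit code {result.returncode}")
        print(result.stderr)
    return result.returncode
```

For example, a script could stop early with a clear message if find_bq() returns None, and otherwise pass its bq load command string to run_and_report.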