Join Operation¶
The Join operation joins two tables by keeping tuple pairs that satisfy a natural language condition.
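As a rough illustration of the idea, the sketch below runs a nested-loop join over two small lists of records and keeps every pair that a predicate accepts. The predicate `matches` is a hypothetical keyword check standing in for the model call that would judge the natural language condition. It is not part of nirvana, and `semantic_join` is likewise an illustrative name.

```python
from itertools import product


def matches(paper: dict, topic: dict) -> bool:
    # Hypothetical stand-in for a model judgment: a real semantic join would
    # ask a model whether the pair satisfies the natural language condition.
    return topic["keyword"].lower() in paper["title"].lower()


def semantic_join(left: list[dict], right: list[dict], predicate) -> list[tuple[int, int]]:
    """Return (left_index, right_index) for every pair the predicate accepts."""
    return [
        (i, j)
        for (i, l_row), (j, r_row) in product(enumerate(left), enumerate(right))
        if predicate(l_row, r_row)
    ]


papers = [
    {"title": "Query Optimization for Semantic Operators"},
    {"title": "A Survey of Graph Neural Networks"},
]
topics = [{"keyword": "query"}, {"keyword": "graph"}, {"keyword": "vision"}]

pairs = semantic_join(papers, topics, matches)
print(pairs)
```

Every left row is compared with every right row, so the number of predicate evaluations grows with the product of the two table sizes.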
Core Implementation¶
nirvana.ops.join.JoinOperation(user_instruction: str = '', left_on: list[str] = [], right_on: list[str] = [], how: str = 'inner', context: list[dict] | str | None = None, model: str | None = None, tool: Callable | BaseTool | None = None, strategy: Literal['nest', 'block'] = 'nest', limit: int | None = None, rate_limit: int = 16, assertions: list[Callable] | None = [], batch_size: int = 5)
Bases: BaseOperation
Join operator: joins the values of two columns according to a user's natural language instruction.
Source code in nirvana/ops/join.py
Attributes¶
strategy_options = ['nest', 'block'] (class-attribute, instance-attribute)
prompter = JoinPrompter() (instance-attribute)
left_on = left_on (instance-attribute)
right_on = right_on (instance-attribute)
how = how (instance-attribute)
batch_size = batch_size (instance-attribute)
dependencies: list[str] (property)
generated_fields: list[str] (property)
op_kwargs: dict (property)
Functions¶
execute(left_data: pd.DataFrame, right_data: pd.DataFrame, **kwargs) (async)
Source code in nirvana/ops/join.py
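Because `execute` is asynchronous and the constructor exposes `rate_limit` and `limit`, one plausible reading is that pair evaluations run concurrently under a cap and that the result is bounded in size. The sketch below shows that general pattern with `asyncio.Semaphore`. The names `judge_pair` and `bounded_join` are hypothetical, the judgment is a case-insensitive equality check rather than a model call, and nothing here is taken from `nirvana/ops/join.py`.

```python
import asyncio


async def judge_pair(left_value: str, right_value: str) -> bool:
    # Hypothetical stand-in for a model call that judges one pair.
    await asyncio.sleep(0)
    return left_value.lower() == right_value.lower()


async def bounded_join(left, right, rate_limit=16, limit=None):
    # At most `rate_limit` judgments are in flight at any moment.
    semaphore = asyncio.Semaphore(rate_limit)

    async def check(i, j):
        async with semaphore:
            return i, j, await judge_pair(left[i], right[j])

    tasks = [check(i, j) for i in range(len(left)) for j in range(len(right))]
    results = await asyncio.gather(*tasks)  # results keep task order
    pairs = [(i, j) for i, j, ok in results if ok]
    # Simplification: this truncates after all pairs are judged. Stopping
    # early would additionally require cancelling the outstanding tasks.
    return pairs if limit is None else pairs[:limit]


left_cities = ["Paris", "Rome"]
right_cities = ["rome", "PARIS", "Oslo"]

all_pairs = asyncio.run(bounded_join(left_cities, right_cities, rate_limit=2))
first_pair = asyncio.run(bounded_join(left_cities, right_cities, rate_limit=2, limit=1))
print(all_pairs, first_pair)
```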
Output Class¶
nirvana.ops.join.JoinOpOutputs(cost: float = 0.0, join_pairs: list[tuple] = list(), left_join_keys: list[int] = list(), right_join_keys: list[int] = list())
dataclass
Bases: BaseOpOutputs
Attributes¶
join_pairs: list[tuple] = field(default_factory=list) (class-attribute, instance-attribute)
left_join_keys: list[int] = field(default_factory=list) (class-attribute, instance-attribute)
right_join_keys: list[int] = field(default_factory=list) (class-attribute, instance-attribute)
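To show how these fields could fit together, the sketch below mirrors the dataclass locally and fills it from a list of matched index pairs. It assumes that each entry of `join_pairs` is a `(left_index, right_index)` tuple and that the key lists hold the distinct matched indices on each side. The source confirms neither assumption, and `JoinOutputsSketch` and `collect` are hypothetical names, not library code.

```python
from dataclasses import dataclass, field


@dataclass
class JoinOutputsSketch:
    # Local mirror of the documented fields, for illustration only.
    cost: float = 0.0
    join_pairs: list[tuple] = field(default_factory=list)
    left_join_keys: list[int] = field(default_factory=list)
    right_join_keys: list[int] = field(default_factory=list)


def collect(pairs: list[tuple], cost: float = 0.0) -> JoinOutputsSketch:
    # Assumption: pairs are (left_index, right_index) tuples.
    return JoinOutputsSketch(
        cost=cost,
        join_pairs=list(pairs),
        left_join_keys=sorted({i for i, _ in pairs}),
        right_join_keys=sorted({j for _, j in pairs}),
    )


outputs = collect([(0, 2), (0, 3), (1, 2)], cost=0.01)

# default_factory gives every instance its own list, so two empty
# outputs never share state.
first, second = JoinOutputsSketch(), JoinOutputsSketch()
first.join_pairs.append((0, 0))
print(outputs, second.join_pairs)
```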
Functions¶
Function Wrapper¶
nirvana.ops.join
join_wrapper(left_data: DataFrame, right_data: DataFrame, user_instruction: str, left_on: str, right_on: str, how: str = 'inner', context: list[dict] | str | None = None, model: str | None = None, func: Callable = None, strategy: Literal['nest', 'block'] = 'nest', limit: int | None = None, rate_limit: int = 16, assertions: list[Callable] | None = [], batch_size: int = 5, **kwargs)
A function wrapper for the join operation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `left_data` | `DataFrame` | Left dataframe. | required |
| `right_data` | `DataFrame` | Right dataframe. | required |
| `user_instruction` | `str` | User instruction describing the join condition. | required |
| `left_on` | `str` | Column of the left dataframe to join on. | required |
| `right_on` | `str` | Column of the right dataframe to join on. | required |
| `how` | `str` | Join type. Defaults to "inner". | `'inner'` |
| `context` | `list[dict] \| str` | Context. Defaults to None. | `None` |
| `model` | `str` | Model. Defaults to None. | `None` |
| `func` | `Callable` | User function. Defaults to None. | `None` |
| `strategy` | `Literal['nest', 'block']` | Join strategy. Defaults to "nest". | `'nest'` |
| `limit` | `int` | Maximum number of outputs to produce before stopping. | `None` |
| `rate_limit` | `int` | Rate limit. Defaults to 16. | `16` |
| `assertions` | `list[Callable]` | Assertions. Defaults to []. | `[]` |
| `batch_size` | `int` | Batch size for block join. Defaults to 5. | `5` |
| `**kwargs` | | Additional keyword arguments for the OpenAI client. | `{}` |
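The `batch_size` description suggests that the "block" strategy groups rows into batches instead of judging one pair per call. The sketch below illustrates that trade-off under an assumption the source does not confirm: "nest" issues one call per pair, while "block" sends one left row together with up to `batch_size` right rows per call. The helpers `batched` and `count_calls` are hypothetical and do not exist in nirvana.

```python
import math


def batched(rows: list, batch_size: int = 5) -> list[list]:
    """Split rows into consecutive chunks of at most batch_size items."""
    return [rows[i:i + batch_size] for i in range(0, len(rows), batch_size)]


def count_calls(n_left: int, n_right: int, strategy: str = "nest", batch_size: int = 5) -> int:
    # Assumed cost model, for illustration only.
    if strategy == "nest":
        return n_left * n_right  # one call per pair
    if strategy == "block":
        return n_left * math.ceil(n_right / batch_size)  # one call per batch
    raise ValueError(f"unknown strategy: {strategy!r}")


right_rows = list(range(12))
chunks = batched(right_rows, batch_size=5)

nest_calls = count_calls(10, 12, strategy="nest")
block_calls = count_calls(10, 12, strategy="block", batch_size=5)
print(chunks, nest_calls, block_calls)
```

Under this assumed model, larger batches reduce the number of calls while making each call carry more rows.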