Inverted string lookup
Transforms numbers to strings using a vocabulary. This is roughly the inverse of String lookup.
Takes two inputs: input and vocabulary. The vocabulary must be one-dimensional and of data type String.
For <Input mode> Integer, the supported input data types are Int8, Int16, Int32, Int64 and Uint8. Also, any shape is supported, and the output shape will be the same as the input.
For <Input mode> One-hot, the supported input data types are Int8, Int16, Int32, Int64, Uint8, Float16, Float32, Float64 and Bool. In case of Bool, a value of false is considered 0 and true is considered 1. Also, any shape is supported, but the size of the last dimension must equal the vocabulary size plus one. The output shape will be the same as the input, but with the last dimension removed.
The output data type will be String.
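The shape rules above can be sketched as a small helper. This is only an illustration, not the node's implementation: the function name `output_shape` and its parameters are hypothetical, and raising an error on a mismatched last dimension is an assumption about how an invalid input would be handled.

```python
def output_shape(input_shape, vocabulary_size, input_mode):
    """Return the output shape for a given input shape and <Input mode>."""
    if input_mode == "Integer":
        # Integer mode: the output has the same shape as the input.
        return tuple(input_shape)
    if input_mode == "One-hot":
        # One-hot mode: the last dimension must be the vocabulary size plus one.
        if len(input_shape) == 0 or input_shape[-1] != vocabulary_size + 1:
            raise ValueError("last dimension must equal vocabulary size + 1")
        # The last dimension is removed from the output.
        return tuple(input_shape[:-1])
    raise ValueError("input_mode must be 'Integer' or 'One-hot'")
```

For a vocabulary of 5 strings, an Integer input of shape (4, 7) keeps shape (4, 7), while a One-hot input of shape (4, 7, 6) produces an output of shape (4, 7).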
Click the node to configure the following parameters in the right panel:
- Input mode: One of Integer or One-hot. If Integer, each number in the input is converted to a string: the number 1 is converted to the first string in the vocabulary, 2 to the second, and so on, and the number 0 is converted to the <Out-of-vocabulary token>. If One-hot, the input should have shape (*, V + 1), where V is the vocabulary length. Each vector of length V + 1 is converted to a string according to the index of its first non-zero position. If the first element is non-zero, the vector is converted to the <Out-of-vocabulary token>. If the first non-zero element is the second one, the vector is converted to the first string in the vocabulary; if it is the third one, to the second string in the vocabulary; and so on. If all elements are zero, the vector is converted to the <Out-of-vocabulary token>. For example, with V = 2, the vector [1, 0, 0] is converted to the <Out-of-vocabulary token>, the vector [0, 1, 0] to the first string in the vocabulary, and [0, 0, 1] to the second string in the vocabulary.
- Out-of-vocabulary token: The string to which an input is converted if it is zero or not recognized.
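The two input modes can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the node's actual code: the function names and the default token "[UNK]" are hypothetical, and treating integers outside the range 1..V as "not recognized" is an assumption.

```python
def lookup_integer(indices, vocabulary, oov_token="[UNK]"):
    """Integer mode: 1 maps to the first vocabulary string, 2 to the second, ..."""
    result = []
    for index in indices:
        if 1 <= index <= len(vocabulary):
            result.append(vocabulary[index - 1])
        else:
            # 0 maps to the out-of-vocabulary token. Mapping other
            # out-of-range values to it as well is an assumption.
            result.append(oov_token)
    return result


def lookup_one_hot(vector, vocabulary, oov_token="[UNK]"):
    """One-hot mode: convert one vector of length V + 1 to a string."""
    if len(vector) != len(vocabulary) + 1:
        raise ValueError("vector length must equal vocabulary size + 1")
    for position, value in enumerate(vector):
        if value:  # non-zero; a Bool value of true counts as 1
            # Position 0 is the out-of-vocabulary slot; position k is word k.
            return oov_token if position == 0 else vocabulary[position - 1]
    # All elements are zero.
    return oov_token


vocabulary = ["cat", "dog"]
integer_result = lookup_integer([1, 2, 0], vocabulary)
one_hot_result = [lookup_one_hot(v, vocabulary)
                  for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 0])]
```

With the vocabulary ["cat", "dog"], `integer_result` is ["cat", "dog", "[UNK]"] and `one_hot_result` is ["[UNK]", "cat", "dog", "[UNK]"], matching the [1, 0, 0], [0, 1, 0], [0, 0, 1] example in the Input mode description.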