GPT-2 Tokenization
430 tokens
<at-dialog title="vm.title" on-close="vm.onClose">←↩
<at-form state="vm.form" autocomplete="off" id="external_test_form">←↩
<at-input-group col="12" tab="20" state="vm.form.inputs" form-id="external_test"></at-input-group>←↩
<at-action-group col="12" pos="right">←↩
<at-action-button←↩
variant="tertiary"←↩
ng-click="vm.onClose()"←↩
>←↩
::vm.strings.get(’CLOSE’) ←↩
</at-action-button>←↩
<at-action-button←↩
variant="primary"←↩
ng-click="vm.onSubmit()"←↩
ng-disabled="!vm.form.isValid || vm.form.disabled"←↩
>←↩
::vm.strings.get(’RUN’) ←↩
</at-action-button>←↩
</at-action-group>←↩
</at-form>←↩
</at-dialog>←↩
GPT-NeoX-20B Tokenization
257 tokens
<at-dialog title="vm.title" on-close="vm.onClose">←↩
<at-form state="vm.form" autocomplete="off" id="external_test_form">←↩
<at-input-group col="12" tab="20" state="vm.form.inputs" form-id="external_test"></at-input-group>←↩
<at-action-group col="12" pos="right">←↩
<at-action-button←↩
variant="tertiary"←↩
ng-click="vm.onClose()"←↩
>←↩
::vm.strings.get(’CLOSE’) ←↩
</at-action-button>←↩
<at-action-button←↩
variant="primary"←↩
ng-click="vm.onSubmit()"←↩
ng-disabled="!vm.form.isValid || vm.form.disabled"←↩
>←↩
::vm.strings.get(’RUN’) ←↩
</at-action-button>←↩
</at-action-group>←↩
</at-form>←↩
</at-dialog>←↩
Figure 15: Pile (GitHub) Tokenization Example
GPT-2 Tokenization
178 tokens
Theresa May is expected to appoint an EU ambassador who “ believes in Brexit” in the wake of the
current Brussels representative’s decision to quit after being cut adrift by Downing Street.
←↩
←↩
Sir Ivan Rogers on Tuesday announced his resignation as Britain’ s ambassador in Brussels after
it was made clear Mrs May and her senior team had “ lost confidence” in him over his “ pessimistic”
view of Brexit.←↩
←↩
Government sources made clear that Sir Ivan had “ jumped before he was pushed” and that Number
10 believed his negative view of Brexit meant that he could not lead the negotiations after the
Prime Minister triggers Article 50.←↩
←↩
In a 1,400-word resignation letter to his staff leaked on Tuesday night, Sir Ivan launched a
thinly-veiled attack on the "muddled thinking" in Mrs May’s Government.
GPT-NeoX-20B Tokenization
170 tokens
Theresa May is expected to appoint an EU ambassador who “believes in Brexit” in the wake of the
current Brussels representative’s decision to quit after being cut adrift by Downing Street.
←↩
←↩
Sir Ivan Rogers on Tuesday announced his resignation as Britain’s ambassador in Brussels after
it was made clear Mrs May and her senior team had “lost confidence” in him over his “pessimistic”
view of Brexit.←↩
←↩
Government sources made clear that Sir Ivan had “jumped before he was pushed” and that Number
10 believed his negative view of Brexit meant that he could not lead the negotiations after the
Prime Minister triggers Article 50.←↩
←↩
In a 1,400-word resignation letter to his staff leaked on Tuesday night, Sir Ivan launched a
thinly-veiled attack on the "muddled thinking" in Mrs May’s Government.
Figure 16: Pile (OpenWebText2) Tokenization Example
GPT-2 Tokenization
268 tokens
Carotid endarterectomy: operative risks, recurrent stenosis, and long-term stroke rates in a
modern series.←↩
To determine whether carotid endarterectomy (CEA) safely and effectively maintained a durable
reduction in stroke complications over an extended period, we reviewed our data on 478
consecutive patients who underwent 544 CEA’s since 1976. Follow-up was complete in 83% of
patients (mean 44 months). There were 7 early deaths (1.3%), only 1 stroke related (0.2%).
Perioperative stroke rates (overall 2.9%) varied according to operative indications: asymptomatic,
1.4%; transient ischemic attacks (TIA)/amaurosis fugax (AF), 1.3%; nonhemispheric symptoms (NH),
4.9%; and prior stroke (CVA), 7.1%. Five and 10-year stroke-free rates were 96% and 92% in the
asymptomatic group, 93% and 87% in the TIA/AF group, 92% and 92% in the NH group, and 80% and
73% in the CVA group. Late ipsilateral strokes occurred infrequently (8 patients, 1.7%). Late
deaths were primarily cardiac related (51.3%). Stro
GPT-NeoX-20B Tokenization
250 tokens
Carotid endarterectomy: operative risks, recurrent stenosis, and long-term stroke rates in a
modern series.←↩
To determine whether carotid endarterectomy (CEA) safely and effectively maintained a durable
reduction in stroke complications over an extended period, we reviewed our data on 478
consecutive patients who underwent 544 CEA’s since 1976. Follow-up was complete in 83% of
patients (mean 44 months). There were 7 early deaths (1.3%), only 1 stroke related (0.2%).
Perioperative stroke rates (overall 2.9%) varied according to operative indications: asymptomatic,
1.4%; transient ischemic attacks (TIA)/amaurosis fugax (AF), 1.3%; nonhemispheric symptoms (NH),
4.9%; and prior stroke (CVA), 7.1%. Five and 10-year stroke-free rates were 96% and 92% in the
asymptomatic group, 93% and 87% in the TIA/AF group, 92% and 92% in the NH group, and 80% and
73% in the CVA group. Late ipsilateral strokes occurred infrequently (8 patients, 1.7%). Late
deaths were primarily cardiac related (51.3%). Stro
Figure 17: Pile (PubMed Abstracts) Tokenization Example
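The token counts reported in Figures 15 to 17 can be compared directly. The following is a minimal sketch that computes the relative reduction in token count when moving from the GPT-2 tokenizer to the GPT-NeoX-20B tokenizer. The counts are copied from the figures; the names `counts` and `relative_reduction` are illustrative and not part of any released code.

```python
# Token counts copied from Figures 15-17 as (GPT-2, GPT-NeoX-20B).
counts = {
    "Pile (GitHub)": (430, 257),
    "Pile (OpenWebText2)": (178, 170),
    "Pile (PubMed Abstracts)": (268, 250),
}


def relative_reduction(gpt2_tokens: int, neox_tokens: int) -> float:
    """Fraction of GPT-2 tokens saved by the GPT-NeoX-20B tokenizer."""
    return (gpt2_tokens - neox_tokens) / gpt2_tokens


# Hypothetical helper mapping: example name -> fraction of tokens saved.
reductions = {name: relative_reduction(*pair) for name, pair in counts.items()}

for name, saved in reductions.items():
    print(f"{name}: {saved:.1%} fewer tokens")
```

On these three examples, the reduction is roughly 40% for the GitHub sample and roughly 4% to 7% for the two prose samples. These are single examples, so the figures illustrate the trend rather than measure it.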