【单调队列】【后缀数组】hdu 6988 Display Substring

作者: wjyyy 分类: 单调队列/单调栈,字符串,解题报告 发布时间: 2021-07-31 00:43

点击量:66

hdu 多校训练题。字符串的开始。

Description

When we display a string on a LED screen, there may be different energy consumptions for displaying different characters. Given a string $S$ with length $n$ consisting of lowercase English letters and the energy consumption of each letter. Assume that the energy consumption of a string is the sum of the energy consumptions of all its characters. Please find the $k$-th smallest energy consumption of all the different substrings of $S$.

Note that a substring of a string $S$ is a contiguous subsequence of $S$. We say two strings $S$ and $T$ are different if the lengths of them are $\textbf{different}$ or there exists at least one position $i$ satisfying $S[i]\ne T[i]$.

Input

The first line of the input contains an integer $T (1≤T≤2×10^3)$ — the number of test cases.

The first line of each test case contains two integers $n (1≤n≤10^5)$ and $k (1≤k≤\frac{n(n+1)}2)$.

The second line of each test case contains a string $S$ with length $n$ consisting of lowercase English letters.

The third line of each test case contains $26$ integers $c_a,c_b,…,c_z (1≤c_\alpha≤100)$ — the energy consumption of each letter.

It is guaranteed that the sum of $n$ among all test cases does not exceed $8×10^5$.

Output

For each test case, output a single integer in a separate line — the $k$-th smallest energy consumption, or $−1$ if there is no answer.

Input sample

2
5 5
ababc
3 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
5 15
ababc
3 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Output sample

4
-1

题意:

给定一个字符串,每个字母有一个权值。问字符串的所有子串去重后,权值第 $k$ 大的是多少。

题解:

1004:Display Substring

(https://acm.hdu.edu.cn/showproblem.php?pid=6988)

知识点:后缀数组、二分答案、单调队列

用后缀数组处理出相同的子串,然后二分答案找出恰好第 $k$ 大的。

注意事项: $k$ 需要开 long long、Windows 环境下剪贴板无法粘贴长为 $10^5$ 的字符串。

用后缀数组求相同子串

对于每个贡献答案的位置,它的 Height 是多少,就说明有多少个它的前缀已经出现。由此去重即可。

二分答案

二分答案 $mid$,对每个左端点找到最后一个区间和不超过 $mid$ 的右端点,前缀和处理即可。

找右端点的过程可以二分,也可以用单调队列,用二分后总复杂度为 $O(n\log n\log\sum c)$。

单调队列的时间复杂度为 $O(n(\log n+\log \sum c))$。

Code:

#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
char s[200100];
int rk[200100],sa[200100],tax[200100],tp[200100];
int n,m,hei[200100];
void rsort()
{
    for(int i=0;i<=m;++i)
        tax[i]=0;
    for(int i=1;i<=n;++i)
        tax[rk[tp[i]]]++;
    for(int i=1;i<=m;++i)
        tax[i]+=tax[i-1];
    for(int i=n;i>=1;--i)
        sa[tax[rk[tp[i]]]--]=tp[i];
}
int val[30],sum[400100];
long long check(int x)
{
    int t=1;
    long long ans=0;
    for(int i=1;i<=n;++i)
    {
        while(t<=n&&sum[t]-sum[i-1]<=x)
            ++t;
        int tmp=(t-i-hei[rk[i]]);
        ans+=tmp>0?tmp:0;
    }
    return ans;
}
int main()
{
    //freopen("a.txt","r",stdin);
    int T=1;
    scanf("%d",&T);
    while(T--)
    {
        long long K;
        scanf("%d%lld",&n,&K);
        scanf("%s",s+1);
        for(int i=0;i<26;++i)
            scanf("%d",val+i);
        for(int i=1;i<=n;++i)
        {
            sum[i]=sum[i-1]+val[s[i]-'a'];
            rk[i]=s[i];
            tp[i]=i;
        }

        m=127;rsort();
        for(int w=1,p=1,i;p<n;w+=w,m=p)
        {
            for(p=0,i=n-w+1;i<=n;++i)
                tp[++p]=i;
            for(i=1;i<=n;++i)
                if(sa[i]>w)
                    tp[++p]=sa[i]-w;
            rsort();
            swap(rk,tp);
            rk[sa[1]]=p=1;
            for(int i=2;i<=n;++i)
                rk[sa[i]]=(tp[sa[i]]==tp[sa[i-1]]&&tp[sa[i]+w]==tp[sa[i-1]+w])?p:++p;
        }

        int k=0;
        for(int i=1;i<=n;hei[rk[i++]]=k)
        {
            k=k>0?k-1:0;
            for(int j=sa[rk[i]-1];s[i+k]==s[j+k];++k);
        }
        int l=0,r=10000001;
        while(l<r)
        {
            int mid=(l+r)>>1;
            if(check(mid)<K)
                l=mid+1;
            else
                r=mid;
        }
        printf("%d\n",l==10000001?-1:l);
    }
    return 0;
}

说点什么

avatar
  Subscribe  
提醒
/* */